In this paper we propose a recursive online algorithm for estimating the parameters of a
time-varying ARCH process. The estimation is done by updating the estimator at time point $t-1$
with observations about the time point $t$ to yield an estimator of the parameter at time point $t$.
The sampling properties of this estimator are studied in a non-stationary context; in particular,
asymptotic normality and an expression for the bias due to non-stationarity are established. By
running two recursive online algorithms in parallel with different step sizes and taking a linear
combination of the estimators, the rate of convergence can be improved for parameter curves
from Hölder classes of order between 1 and 2.

1. Introduction

The class of autoregressive conditional heteroscedastic (ARCH) processes can be
generalized to include non-stationary processes by including models with parameters which
are time-dependent. More precisely, $\{X_{t,N}\}$ is called a time-varying ARCH (tvARCH)
process of order $p$ if it satisfies the representation

$$X_{t,N} = Z_t\sigma_{t,N}, \qquad \sigma_{t,N}^2 = a_0\Bigl(\frac{t}{N}\Bigr) + \sum_{j=1}^{p} a_j\Bigl(\frac{t}{N}\Bigr) X_{t-j,N}^2, \tag{1}$$

where $\{Z_t\}$ are independent, identically distributed random variables with $E(Z_0) = 0$ and
$E(Z_0^2) = 1$. This class of tvARCH processes was investigated in Dahlhaus and Subba Rao
[4]. It was shown that it can locally be approximated by a stationary process; we
summarize the details below. Furthermore, a local quasi-likelihood method was proposed to
estimate the parameters of the tvARCH($p$) model.
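
To make the recursion in (1) concrete, the following minimal Python sketch simulates a tvARCH($p$) path. It is an illustration only: the function name `simulate_tvarch`, the Gaussian innovations, the zero start-up values and the example coefficient curves are assumptions made here and are not taken from the paper.

```python
import math
import random


def simulate_tvarch(coef_funcs, n, seed=0):
    """Simulate X_{1,N}, ..., X_{N,N} from the tvARCH(p) recursion (1).

    coef_funcs is a list [a_0, a_1, ..., a_p] of functions on (0, 1].
    Assumptions of this sketch: Z_t is standard normal and the p
    start-up values X_{1-p,N}, ..., X_{0,N} are set to zero.
    """
    rng = random.Random(seed)
    p = len(coef_funcs) - 1
    x = [0.0] * p  # start-up values (hypothetical choice)
    for t in range(1, n + 1):
        u = t / n  # rescaled time t/N
        sigma2 = coef_funcs[0](u)
        for j in range(1, p + 1):
            sigma2 += coef_funcs[j](u) * x[-j] ** 2
        x.append(rng.gauss(0.0, 1.0) * math.sqrt(sigma2))
    return x[p:]


if __name__ == "__main__":
    # Example curves (assumed): a_0(u) = 1 + 0.5u, a_1(u) = 0.2 + 0.3u.
    path = simulate_tvarch([lambda u: 1.0 + 0.5 * u,
                            lambda u: 0.2 + 0.3 * u], n=1000, seed=1)
    print(len(path), max(abs(v) for v in path))
```

The coefficient functions are evaluated at the rescaled time $t/N$, so doubling `n` samples the same curves on a finer grid rather than extending them.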

A potential application of the tvARCH process is to model long financial time series.
The modelling of financial data using non-stationary time series models has recently
attracted considerable attention. A justification for using such models can be found, for
example, in Mikosch and Starica [9, 10]. However, given that financial time series are often
sampled at high frequency, evaluating the likelihood as each observation comes online
can be computationally expensive. Thus an 'online' method, which uses the previous
estimate of the parameters at time point $t-1$ and the observation at time point $t$ to
estimate the parameter at time point $t$, would be ideal and cost-effective. There exists a
huge literature on recursive algorithms, mainly in the context of linear systems (cf. Ljung
and Söderström [8]; Solo [12, 13]) or neural networks (cf. White [15]; Chen and White
[2]). For a general overview, see also Kushner and Yin [7]. Motivated by the least mean
squares algorithm in Moulines et al. [11] for time-varying autoregressive processes, we
consider in this paper the following recursive online algorithm for tvARCH models:

$$\hat{a}_{t,N} = \hat{a}_{t-1,N} + \lambda\{X_{t,N}^2 - \hat{a}_{t-1,N}^T\mathcal{X}_{t-1,N}\}\frac{\mathcal{X}_{t-1,N}}{|\mathcal{X}_{t-1,N}|_1^2}, \qquad t = p+1,\ldots,N, \tag{2}$$

with $\mathcal{X}_{t-1,N}^T = (1, X_{t-1,N}^2, \ldots, X_{t-p,N}^2)$, $|\mathcal{X}_{t-1,N}|_1 = 1 + \sum_{j=1}^{p} X_{t-j,N}^2$ and initial
conditions $\hat{a}_{p,N} = (0,\ldots,0)$. This algorithm is linear in the estimators, despite the
nonlinearity of the tvARCH process. We call the stochastic algorithm defined in (2) the ARCH
normalized recursive estimation (ANRE) algorithm. Let $a(u)^T = (a_0(u),\ldots,a_p(u))$; then
$\hat{a}_{t,N}$ is regarded as an estimator of $a(t/N)$ or of $a(u)$ if $|t/N - u| < 1/N$.
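
The update in (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `anre` and `simulate_arch1`, the Gaussian innovations and the parameter values in the demo are assumptions made here.

```python
import math
import random


def anre(x, p, lam):
    """Run the ANRE recursion (2) on observations x = [X_1, ..., X_N].

    Returns the list of estimates [a_hat_p, a_hat_{p+1}, ..., a_hat_N],
    each a list of length p + 1, starting from a_hat_p = (0, ..., 0).
    """
    a = [0.0] * (p + 1)
    path = [a[:]]
    for i in range(p, len(x)):  # x[i] is X_t with t = i + 1
        # Regressor (1, X_{t-1}^2, ..., X_{t-p}^2) and its l1 norm.
        reg = [1.0] + [x[i - j] ** 2 for j in range(1, p + 1)]
        norm1 = sum(reg)
        # Prediction error X_t^2 - a_hat_{t-1}^T regressor.
        err = x[i] ** 2 - sum(ak * rk for ak, rk in zip(a, reg))
        step = lam * err / norm1 ** 2
        a = [ak + step * rk for ak, rk in zip(a, reg)]
        path.append(a[:])
    return path


def simulate_arch1(n, a0, a1, seed=0):
    """Stationary ARCH(1) test data with Gaussian innovations (assumed)."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = rng.gauss(0.0, 1.0) * math.sqrt(a0 + a1 * prev ** 2)
        x.append(prev)
    return x


if __name__ == "__main__":
    data = simulate_arch1(20000, a0=1.0, a1=0.3, seed=0)
    estimates = anre(data, p=1, lam=0.01)
    print(estimates[-1])
```

Each update costs $O(p)$ operations and needs only the previous estimate and the last $p+1$ observations, which is what makes the scheme suitable for online use.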

In this paper we will prove the consistency and asymptotic normality of this recursive
estimator. Furthermore, we will discuss the improvements of the estimator obtained
by combining two estimates from (2) with different step sizes $\lambda$. Unlike in most other
work in the area of recursive estimation, the properties of the estimator are proved under the
assumption that the true process is a process with time-varying coefficients, that is,
a non-stationary process. The rescaling of the coefficients in (1) to the unit interval
corresponds to the 'infill asymptotics' in nonparametric regression: as $N \to \infty$ the system
does not describe the asymptotic behaviour of the system in a physical sense, but is meant
as a meaningful asymptotics to approximate, for example, the distribution of estimates
based on a finite sample size. A similar approach was used in Moulines et al. [11] for
time-varying autoregressive models. A more detailed discussion of the relevance of this
approach and the relation to non-rescaled processes can be found in Section 3.

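In fact the ANRE algorithm resembles the NLMS algorithm investigated in Moulines
et al. [11]. Rewriting (2), we have

$$\hat{a}_{t,N} = \Bigl(I - \lambda\frac{\mathcal{X}_{t-1,N}\mathcal{X}_{t-1,N}^T}{|\mathcal{X}_{t-1,N}|_1^2}\Bigr)\hat{a}_{t-1,N} + \lambda\frac{X_{t,N}^2\mathcal{X}_{t-1,N}}{|\mathcal{X}_{t-1,N}|_1^2}. \tag{3}$$
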
We can see from (3) that the convergence of the ANRE algorithm relies on showing some
type of exponential decay of the past. In this paper we will show that for any $p > 0$,

$$E\Bigl\|\prod_{i=1}^{k}\Bigl(I - \lambda\frac{\mathcal{X}_{t-i-1,N}\mathcal{X}_{t-i-1,N}^T}{|\mathcal{X}_{t-i-1,N}|_1^2}\Bigr)\Bigr\|^{p} \le K(1-\lambda\delta)^k \quad \text{for some } \delta > 0, \tag{4}$$

where $\|\cdot\|$ denotes the spectral norm and $\prod_{i=0}^{n} A_i = A_0\cdots A_n$. Roughly speaking, this
means we have, on average, exponential decay of the past. Similar properties are often
established in the control literature and referred to as persistence of excitation of the
stochastic matrix (see, for example, Guo [5]; Aguech et al. [1]), which in our case is the
matrix $(I - \lambda\mathcal{X}_{t-1,N}\mathcal{X}_{t-1,N}^T/|\mathcal{X}_{t-1,N}|_1^2)$. Persistence of excitation guarantees convergence of
the algorithm, which we will use to prove the asymptotic properties of $\hat{a}_{t,N}$.

In Section 2 we state all results on the asymptotic behaviour of $\hat{a}_{t,N}$ including
consistency, asymptotic normality and rate efficiency. Furthermore, we suggest a modified
algorithm based on two parallel algorithms. In Section 3 we discuss practical implications.
Sections 4 and 5 contain the proofs, which in large part are based on the perturbation
technique. Some technical methods have been gathered in the Appendix. We note that
some of the results in the Appendix are of independent interest, as they deal with the
probabilistic properties of ARCH and tvARCH processes and their vector representations.

2. The ANRE algorithm 

We first review some properties of the tvARCH process. Dahlhaus and Subba Rao [4]
and Subba Rao [14] have shown that the tvARCH process can be locally approximated
by a stationary ARCH process. Let $u$ be fixed and

$$X_t(u) = Z_t\sigma_t(u), \qquad \sigma_t(u)^2 = a_0(u) + \sum_{j=1}^{p} a_j(u)X_{t-j}(u)^2, \tag{5}$$

where $\{Z_t\}$ are independent, identically distributed random variables with $E(Z_0) = 0$
and $E(Z_0^2) = 1$. We also set $\mathcal{X}_t(u)^T = (1, X_t(u)^2, \ldots, X_{t-p+1}(u)^2)$. In Lemma 4.1 we show
that $X_t(u)^2$ can be regarded as the stationary approximation of $X_{t,N}^2$ around the time
points $t/N \approx u$.

Assumption 2.1. Let $\{X_{t,N}\}$ and $\{X_t(u)\}$ be sequences of stochastic processes which
satisfy (1) and (5) respectively.

(i) For some $r \in [1,\infty)$, there exists $\eta > 0$ such that $\{E(Z_0^{2r})\}^{1/r}\sup_u\{\sum_{j=1}^{p} a_j(u)\} < 1 - \eta$.

(ii) There exist $0 < \rho_1 \le \rho_2 < \infty$ such that, for all $u \in (0,1]$, $\rho_1 \le a_0(u) \le \rho_2$.

(iii) There exist $\beta \in (0,1]$ and a constant $K$ such that for $u, v \in (0,1]$,

$$|a_j(u) - a_j(v)| \le K|u-v|^{\beta} \quad \text{for } j = 0,\ldots,p.$$

(iv) Let $Y_0(u) = a_0(u)Z_0^2$ and $Y_t(u) = \{a_0(u) + \sum_{j=1}^{t} a_j(u)Y_{t-j}(u)\}Z_t^2$ $(t = 1,\ldots,p)$.
Define $\mathcal{Y}_p(u) = (1, Y_1(u),\ldots,Y_p(u))^T$ and $\Sigma(u) = E(\mathcal{Y}_p(u)\mathcal{Y}_p(u)^T)$. Then there
exists a constant $C$ such that $\inf_u\lambda_{\min}\{\Sigma(u)\} \ge C$.

Remark 2.1. It is clear that $\Sigma(u)$ is a positive semi-definite matrix, hence its smallest
eigenvalue is greater than or equal to zero. It can be shown that if $p/E(Z_t^4)^{1/2} < 1$ and
$\sup_u a_0(u) > 0$, then $\lambda_{\min}(\Sigma(u)) \ge (1 - p/E(Z_t^4)^{1/2})/a_0(u)^{(2p+1)/(p+1)}$. However, this
condition is only sufficient and lower bounds can be obtained under much weaker conditions.

We now investigate the asymptotic properties of the ANRE algorithm. We assume
that $\lambda \to 0$ and $\lambda N \to \infty$ as $N \to \infty$. We mention explicitly that $\lambda$ does not depend on
$t$; that is, we are considering the fixed-step-size case. The assumption $\lambda \to 0$ is possible
in the triangular array framework of our model and the resulting assertions (e.g., Theorem
2.2) are meant as an approximation of the corresponding finite sample size distributions
and not as the limit in any physical sense.

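The following results are based on a representation proved at the end of Section 4.3.
The difference $\hat{a}_{t_0,N} - a(u_0)$ is dominated by two terms, that is,

$$\hat{a}_{t_0,N} - a(u_0) = \mathcal{L}_{t_0}(u_0) + \mathcal{R}_{t_0,N}(u_0) + O_p(\delta_N), \tag{6}$$

where

$$\delta_N = \frac{1}{(N\lambda)^{2\beta}} + \frac{\lambda^{1/2}}{(N\lambda)^{\beta}} + \lambda + \frac{1}{N^{\beta}}, \tag{7}$$

$$\begin{aligned}
\mathcal{L}_{t_0}(u_0) &= \sum_{k=0}^{t_0-p-1}\lambda\{I - \lambda F(u_0)\}^{k}\mathcal{M}_{t_0-k}(u_0), \\
\mathcal{R}_{t_0,N}(u_0) &= \sum_{k=0}^{t_0-p-1}\lambda\{I - \lambda F(u_0)\}^{k}\Bigl(\Bigl\{\mathcal{M}_{t_0-k}\Bigl(\frac{t_0-k}{N}\Bigr) - \mathcal{M}_{t_0-k}(u_0)\Bigr\} + F(u_0)\Bigl\{a\Bigl(\frac{t_0-k}{N}\Bigr) - a(u_0)\Bigr\}\Bigr),
\end{aligned} \tag{8}$$

with

$$\mathcal{M}_t(u) = (Z_t^2 - 1)\sigma_t(u)^2\frac{\mathcal{X}_{t-1}(u)}{|\mathcal{X}_{t-1}(u)|_1^2}, \qquad F(u) = E\Bigl(\frac{\mathcal{X}_0(u)\mathcal{X}_0(u)^T}{|\mathcal{X}_0(u)|_1^2}\Bigr). \tag{9}$$

We note that $\mathcal{L}_{t_0}(u_0)$ and $\mathcal{R}_{t_0,N}(u_0)$ play two different roles. $\mathcal{L}_{t_0}(u_0)$ is the weighted sum
of the stationary random variables $\{X_t(u_0)\}_t$, which locally approximate the tvARCH
process $\{X_{t,N}\}_t$, whereas $\mathcal{R}_{t_0,N}(u_0)$ is the (stochastic) bias due to non-stationarity; if the
tvARCH process were stationary this term would be zero. It is clear from the above that
the magnitude of $\mathcal{R}_{t_0,N}(u_0)$ depends on the regularity of the time-varying parameters
$a(u)$, for example, the Hölder class that $a(u)$ belongs to. By using (6) we are able to
obtain a bound for the mean squared error of $\hat{a}_{t_0,N}$. Let $|\cdot|$ denote the Euclidean norm
of a vector.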



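Theorem 2.1. Suppose Assumption 2.1 holds with $r > 4$ and $u_0 > 0$. Then if
$|u_0 - t_0/N| < 1/N$, we have

$$E\{|\hat{a}_{t_0,N} - a(u_0)|^2\} = O\Bigl(\lambda + \frac{1}{(N\lambda)^{2\beta}}\Bigr), \tag{10}$$

where $\lambda \to 0$ as $N \to \infty$ and $N\lambda \gg (\log N)^{1+\varepsilon}$, with some $\varepsilon > 0$.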

The proof can be found at the end of Section 4.2. Theorem 2.1 implies $\hat{a}_{t_0,N} \overset{P}{\to} a(u_0)$.
The stochastic term $\mathcal{L}_{t_0}(u_0)$ is the sum of martingale differences, which allows us to
obtain the following central limit theorem, whose proof is at the end of Section 4.3.

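Theorem 2.2. Suppose Assumption 2.1 holds with $r > 4$ and $u_0 > 0$. Let $\mathcal{R}_{t_0,N}(u_0)$ be
defined as in (8). If $|t_0/N - u_0| < 1/N$

(i) and $\lambda \gg N^{-4\beta/(4\beta+1)}$ and $\lambda \gg N^{-2\beta}$, then

$$\lambda^{-1/2}\{\hat{a}_{t_0,N} - a(u_0)\} - \lambda^{-1/2}\mathcal{R}_{t_0,N}(u_0) \overset{D}{\to} \mathcal{N}(0, \Sigma(u_0)); \tag{11}$$

(ii) and $\lambda \gg N^{-2\beta/(2\beta+1)}$, then

$$\lambda^{-1/2}\{\hat{a}_{t_0,N} - a(u_0)\} \overset{D}{\to} \mathcal{N}(0, \Sigma(u_0)), \tag{12}$$

where $\lambda \to 0$ as $N \to \infty$ and $N\lambda \gg (\log N)^{1+\varepsilon}$, for some $\varepsilon > 0$, with

$$\Sigma(u) = \frac{\mu_4}{2}F(u)^{-1}E\Bigl(\frac{\sigma_1(u)^4\mathcal{X}_0(u)\mathcal{X}_0(u)^T}{|\mathcal{X}_0(u)|_1^4}\Bigr), \qquad \mu_4 = E(Z_0^4) - 1. \tag{13}$$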

Until now we have assumed $a(u) \in \mathrm{Lip}(\beta)$, where $\beta \le 1$. Let $\dot{f}(u)$ denote the derivative
of the vector or matrix $f(\cdot)$ with respect to $u$. Suppose $0 < \beta' \le 1$ and $\dot{a}(u) \in \mathrm{Lip}(\beta')$;
then we say $a(u) \in \mathrm{Lip}(1+\beta')$. We now show that an exact expression for the bias can
be obtained if $a(u) \in \mathrm{Lip}(1+\beta')$ and $\beta' > 0$. We make the following assumptions.

Assumption 2.2. Let $\{X_{t,N}\}$ be a sequence of stochastic processes which satisfies
Assumption 2.1 and $|\dot{a}_i(u) - \dot{a}_i(v)| \le K|u-v|^{\beta'}$ $(i = 0,\ldots,p)$ for some $\beta' > 0$.

Under this assumption we show in Lemma 5.3 that

$$E\{\hat{a}_{t_0,N} - a(u_0)\} = -\frac{1}{N\lambda}F(u_0)^{-1}\dot{a}(u_0) + O\Bigl(\frac{1}{(N\lambda)^{1+\beta'}}\Bigr). \tag{14}$$

We note that typically it is not possible to obtain an exact expression for the bias of
parameter estimates of an ARCH process. By using the expression above for the bias we
obtain the following theorem, whose proof is at the end of Section 5. Let $\mathrm{tr}(\cdot)$ denote the
trace of the matrix.



Theorem 2.3. Suppose Assumption 2.2 holds with $r > 4$ and $u_0 > 0$. Then if
$|t_0/N - u_0| < 1/N$ we have



E|a t0iJV -o(u )| 2 = Atr{EK)} + ^^|F( Uo ) ^(uo) 



1 A 1 / 2 1 1 
and if $\lambda$ is such that $\lambda^{-1/2}/(N\lambda)^{1+\beta'} \to 0$, then

$$\lambda^{-1/2}(\hat{a}_{t_0,N} - a(u_0)) + \lambda^{-1/2}\frac{1}{N\lambda}F(u_0)^{-1}\dot{a}(u_0) \overset{D}{\to} \mathcal{N}(0, \Sigma(u_0)), \tag{16}$$

where $\lambda \to 0$ as $N \to \infty$ and $\lambda N \gg (\log N)^{1+\varepsilon}$, for some $\varepsilon > 0$.
Let ||/ 1|/3 be the bounded Lipschitz norm and 

C fi (L) = (/(•) = (/o(-), • ■ • , /p+i(-))|/: [0, 1] - H/ll/3 < L, 

p+i ^ 

supVVj('u) <l-£,0<Pi <inf/ (u) <p 2 <oo >. 
1=1 ) 



(17) 



In Dahlhaus and Subba Rao [3] we derived the following minimax risk for estimators
$\tilde{a}_{t_0,N}$ of $a(u_0)$:

$$\min_{\tilde{a}_{t_0,N}\in\sigma(X_{1,N},\ldots,X_{N,N})}\;\max_{a(\cdot)\in\mathcal{C}^{\nu}(L)} E|\tilde{a}_{t_0,N} - a(u_0)|^2 \ge KN^{-2\nu/(2\nu+1)}. \tag{18}$$

Comparing this bound with (10), it is straightforward to show that the ANRE algorithm
attains the optimal rate if $a(\cdot) \in \mathrm{Lip}(\nu)$ with $\nu \le 1$ (with $\lambda = N^{-2\nu/(1+2\nu)}$). It is a different
story when $1 < \nu \le 2$. If $a(\cdot) \in \mathrm{Lip}(1+\beta')$, $\beta' > 0$, the mean squared error of the ANRE
estimator in (15) becomes minimal for $\lambda \approx N^{-2/3}$ with minimum rate
$E|\hat{a}_{t_0,N} - a(u_0)|^2 = O(N^{-2/3})$. However, in (18) the minimax rate in this case is
$N^{-2(1+\beta')/(1+2(1+\beta'))}$, which is smaller than $N^{-2/3}$. We now present a recursive method
which attains the optimal rate.
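
The rate $N^{-2/3}$ can be checked by a short calculation. As a sketch, write the two leading terms of the mean squared error as $A\lambda + B/(N\lambda)^2$, where $A$ and $B$ are shorthand introduced here for the positive leading constants of the variance and squared-bias terms, and ignore all remainder terms:

$$g(\lambda) = A\lambda + \frac{B}{(N\lambda)^2}, \qquad g'(\lambda) = A - \frac{2B}{N^2\lambda^3} = 0 \;\Longrightarrow\; \lambda^{*} = \Bigl(\frac{2B}{A}\Bigr)^{1/3}N^{-2/3}, \qquad g(\lambda^{*}) = 3\cdot 2^{-2/3}A^{2/3}B^{1/3}N^{-2/3}.$$

For comparison, the minimax exponent satisfies $\frac{2(1+\beta')}{3+2\beta'} > \frac{2}{3}$ exactly when $6 + 6\beta' > 6 + 4\beta'$, that is, when $\beta' > 0$, so the minimax rate is indeed of smaller order than $N^{-2/3}$.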

Remark 2.2 (Bias correction, rate optimality). The idea here is to achieve a bias
correction and the optimal rate by running two ANRE algorithms with different step sizes $\lambda_1$
and $\lambda_2$ in parallel. Let $\hat{a}_{t,N}(\lambda_1)$ and $\hat{a}_{t,N}(\lambda_2)$ be the ANRE algorithms with step size $\lambda_1$
and $\lambda_2$ respectively, and assume that $\lambda_1 > \lambda_2$. By using (14) for $i = 1, 2$, we have

$$E\{\hat{a}_{t_0,N}(\lambda_i)\} = a(u_0) - \frac{1}{N\lambda_i}F(u_0)^{-1}\dot{a}(u_0) + O\Bigl(\frac{1}{(N\lambda_i)^{1+\beta'}}\Bigr). \tag{19}$$

Since $a(u_0) - (N\lambda_i)^{-1}F(u_0)^{-1}\dot{a}(u_0) \approx a(u_0 - (N\lambda_i)^{-1}F(u_0)^{-1})$, we heuristically estimate
$a(u_0 - (N\lambda_i)^{-1}F(u_0)^{-1})$ instead of $a(u_0)$ by the algorithm. By using two different $\lambda_i$ we
can find a linear combination of the corresponding estimates such that we 'extrapolate'
the two values $a(u_0) - (N\lambda_i)^{-1}F(u_0)^{-1}\dot{a}(u_0)$ $(i = 1, 2)$ to $a(u_0)$. Formally, let $0 < w < 1$,
$\lambda_2 = w\lambda_1$ and

$$\check{a}_{t_0,N}(w) = \frac{1}{1-w}\hat{a}_{t_0,N}(\lambda_1) - \frac{w}{1-w}\hat{a}_{t_0,N}(\lambda_2).$$
If $|t_0/N - u_0| < 1/N$, then by using (19) we have

$$E\{\check{a}_{t_0,N}(w)\} = a(u_0) + O\Bigl(\frac{1}{(N\lambda)^{1+\beta'}}\Bigr).$$

By using Proposition 4.3 we have

$$E|\check{a}_{t_0,N}(w) - a(u_0)|^2 = O\Bigl(\lambda + \frac{1}{(N\lambda)^{2(1+\beta')}}\Bigr),$$

and choosing $\lambda = \mathrm{const.}\times N^{-(2+2\beta')/(3+2\beta')}$ gives the optimal rate. There remains the
problem of choosing $\lambda$ (and $w$). It is obvious that $\lambda$ should be chosen adaptively to
the degree of non-stationarity. That is, $\lambda$ should be large if the characteristics of the
process are changing more rapidly. However, a more specific suggestion would require
more investigations, both theoretically and by simulation.

The above method cannot be extended to higher-order derivatives, since the other
remainders are of a lower order $(N\lambda)^{-2}$ (see (15) and the proof of Theorem 2.3).

Finally, we mention that choosing $\lambda_2 < w\lambda_1$ will lead to an estimator of $a(u_0 + \Delta)$ with
some $\Delta > 0$ (with rate as above). This could be the basis for the prediction of volatility
of tvARCH processes. □


3. Practical implications 

Suppose that we observe data from a (non-rescaled) tvARCH process in discrete time

$$X_t = Z_t\sigma_t, \qquad \sigma_t^2 = a_0(t) + \sum_{j=1}^{p} a_j(t)X_{t-j}^2, \qquad t \in \mathbb{Z}. \tag{20}$$

In order to estimate $a(t)$ we use the estimator $\hat{a}_t$ as given in (2) (with all subscripts $N$
dropped). An approximation for the distribution of the estimator is given by Theorem 2.2.
Theorem 2.2(ii) can be used directly since it is completely formulated without $N$. The
matrices $F(u_0)$ and $\Sigma(u_0)$ depend on the unknown stationary approximation $X_t(u_0)$ of
the process at $u_0 = t_0/N$, that is, at time $t_0$ in non-rescaled time. Since this approximation
is unknown we may instead use the process itself in a small neighbourhood of $t_0$, that is,
we may estimate, for example, $F(u_0)$ by

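with $m$ small and $\mathcal{X}_{t-1}^T = (1, X_{t-1}^2, \ldots, X_{t-p}^2)$. An estimator which fits the recursive
algorithm better is

$$[1 - (1-\lambda)^{t_0-p+1}]^{-1}\sum_{j=0}^{t_0-p}\lambda(1-\lambda)^{j}\frac{\mathcal{X}_{t_0-j}\mathcal{X}_{t_0-j}^T}{|\mathcal{X}_{t_0-j}|_1^2}.$$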
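
An exponentially weighted estimator of $F(u_0)$ of this kind can be sketched as follows. This is an illustration under stated assumptions: the function names `regressor` and `ewma_f` are hypothetical, the weights $\lambda(1-\lambda)^j$ are normalised by $1-(1-\lambda)^{t_0-p+1}$, and the regressor at time $s$ is taken to be $(1, X_s^2, \ldots, X_{s-p+1}^2)$.

```python
import random


def regressor(x, s, p):
    """Vector (1, X_s^2, ..., X_{s-p+1}^2) for x = [X_1, ..., X_N], s >= p."""
    return [1.0] + [x[s - 1 - i] ** 2 for i in range(p)]


def ewma_f(x, t0, p, lam):
    """Exponentially weighted average of v v^T / |v|_1^2 over times
    t0, t0 - 1, ..., p, with weights lam * (1 - lam)^j (normalised)."""
    d = p + 1
    f = [[0.0] * d for _ in range(d)]
    for j in range(t0 - p + 1):
        v = regressor(x, t0 - j, p)
        weight = lam * (1.0 - lam) ** j / sum(v) ** 2
        for r in range(d):
            for c in range(d):
                f[r][c] += weight * v[r] * v[c]
    scale = 1.0 - (1.0 - lam) ** (t0 - p + 1)
    return [[entry / scale for entry in row] for row in f]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(300)]  # placeholder data
    for row in ewma_f(data, t0=300, p=2, lam=0.05):
        print(row)
```

Because every entry of the regressor is nonnegative and the weights are normalised, the entries of the returned matrix always sum to one, which gives a quick sanity check on an implementation.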

In the same way we can estimate $\Sigma(u_0)$ which altogether leads, for example, to an
approximate confidence interval for $a_t$. In a similar way Theorem 2.2(i) can be used.

The situation is more difficult with Theorems 2.1 and 2.3, since here the results depend
(at first sight) on $N$. Suppose that we have parameter functions $\tilde{a}_j(\cdot)$ and some $N > t_0$
with $\tilde{a}_j(t_0/N) = a_j(t_0)$ (i.e. the original function has been rescaled to the unit interval).
Consider Theorem 2.3 with the functions $\tilde{a}_j(\cdot)$. The bias in (14) and (15) contains the
term

$$\frac{1}{N}\dot{\tilde{a}}_j(u_0) \approx \frac{1}{N}\,\frac{\tilde{a}_j(t_0/N) - \tilde{a}_j((t_0-1)/N)}{1/N} = a_j(t_0) - a_j(t_0-1),$$

which again is independent of $N$. To avoid confusion we mention that $N^{-1}\dot{\tilde{a}}_j(u_0)$ of course
depends on $N$ once the function $\tilde{a}_j(\cdot)$ has been fixed (as in the asymptotic approach of
this paper) but it does not depend on $N$ when it is used to approximate the function $a_j(t)$,
since then the function $\tilde{a}_j(\cdot)$ is a different one for each $N$. In the spirit of the remarks
above we would, for example, use the expression

[1 - (1 - A)*"-** 1 ]- 1 J] A(l - A)>, (i ) - %(i - j)] 

3=0 

as an estimator of $N^{-1}\dot{\tilde{a}}_j(u_0)$ in (14) and (15).

These considerations also demonstrate the need for the asymptotic approach of this
paper. While it is not possible to set down a meaningful asymptotic theory for the
model (20) and to derive, for example, a central limit theorem for the estimator $\hat{a}_t$, the
approach of the present paper for the rescaled model (1) leads to such results. This is
achieved by the 'infill asymptotics' where more and more data become available for each
local structure (e.g. about time $u_0$) as $N \to \infty$. The results can then be used also for
approximations in the model (20), for example, for confidence intervals.

4. Proofs 

4.1. Some preliminary results 

In the next lemma we give a bound for the approximation error between $X_{t,N}^2$ and $X_t(u)^2$.
The proofs of these results and further details can be found in Dahlhaus and Subba Rao
[4]; see also Subba Rao [14].

Lemma 4.1. Suppose Assumption 2.1 holds with some $r \ge 1$. Let $\{X_{t,N}\}$ and $\{X_t(u)\}$
be defined as in (1) and (5). Then we have:

(i) $\{X_t(u)^2\}_t$ is a stationary ergodic process. Furthermore, there exists a stochastic
process $\{V_{t,N}\}_t$ and a stationary ergodic process $\{W_t\}_t$ with $\sup_{t,N}E(|V_{t,N}|^r) < \infty$
and $E(|W_t|^r) < \infty$, such that

$$|X_{t,N}^2 - X_t(u)^2| \le \frac{1}{N^{\beta}}V_{t,N} + \Bigl|\frac{t}{N} - u\Bigr|^{\beta}W_t, \tag{21}$$

$$|X_t(u)^2 - X_t(v)^2| \le |u-v|^{\beta}W_t. \tag{22}$$

(ii) $\sup_{t,N}E(X_{t,N}^{2r}) < \infty$ and $\sup_u E(X_t(u)^{2r}) < \infty$.



We now define the derivative process by $\{\dot{X}_t^2(u)\}_t$, which is almost surely the derivative
of the squared ARCH process $\{X_t(u)^2\}_t$ (i.e., $\dot{X}_t^2(u) = \partial X_t(u)^2/\partial u$).

Lemma 4.2. Suppose Assumption 2.2 holds with some $r \ge 1$. Then the derivative process
$\{\dot{X}_t^2(u)\}_t$ is a well-defined stationary ergodic process which satisfies the representation

$$\dot{X}_t^2(u) = \Bigl\{\dot{a}_0(u) + \sum_{j=1}^{p}\bigl[\dot{a}_j(u)X_{t-j}(u)^2 + a_j(u)\dot{X}_{t-j}^2(u)\bigr]\Bigr\}Z_t^2$$

and

$$|\dot{X}_t^2(u) - \dot{X}_t^2(v)| \le |u-v|^{\beta'}W_t, \tag{23}$$

where $W_t$ is the same as in Lemma 4.1. Almost surely all paths of $\{X_t(u)^2\}_u$ belong to
Lip(1) and we have the Taylor series expansion

$$X_{t,N}^2 = X_t(u)^2 + \Bigl(\frac{t}{N} - u\Bigr)\dot{X}_t^2(u) + \Bigl(\Bigl|\frac{t}{N} - u\Bigr|^{1+\beta'} + \frac{1}{N}\Bigr)R_{t,N},$$

where $|R_{t,N}|_1 \le (V_{t,N} + W_t)$. Furthermore, $\sup_N X_{t,N}^2 \le W_t$, $\sup_u X_t(u)^2 \le W_t$ and
$\sup_u|\dot{X}_t^2(u)| \le W_t$ with bounded norms $\sup_u E(|\dot{X}_t^2(u)|^r) < \infty$ and $\sup_{t,N}E(|R_{t,N}|^r) < \infty$.



Let $\mathcal{F}_t = \sigma(Z_t^2, Z_{t-1}^2, \ldots)$. We have $\mathcal{F}_t = \sigma(X_{t,N}^2, X_{t-1,N}^2, \ldots) = \sigma(X_t(u)^2, X_{t-1}(u)^2, \ldots)$,
since a Volterra expansion gives $X_{t,N}^2$ in terms of $\{Z_t^2\}_t$ and the ARCH equations give $Z_t^2$
in terms of $\{X_{t,N}^2\}_t$. We now consider the mixing properties of functions of the processes
$\{\mathcal{X}_{t,N}\}_t$ and $\{\mathcal{X}_t(u)\}_t$. The proof of the proposition below can be found in Section A.1.

Proposition 4.1. Suppose Assumption 2.1 holds with $r = 1$. Let $\{X_{t,N}\}$ and $\{X_t(u)\}$
be defined as in (1) and (5), respectively. Then there exists a $(1-\eta) \le \rho < 1$ such that
for any $\phi \in \mathrm{Lip}(1)$,

$$|E[\phi(\mathcal{X}_{t,N})\,|\,\mathcal{F}_{t-k}] - E[\phi(\mathcal{X}_{t,N})]|_1 \le K\rho^{k}(1 + |\mathcal{X}_{t-k,N}|_1), \tag{24}$$
$$|E[\phi(\mathcal{X}_t(u))\,|\,\mathcal{F}_{t-k}] - E[\phi(\mathcal{X}_t(u))]|_1 \le K\rho^{k}(1 + |\mathcal{X}_{t-k}(u)|_1), \tag{25}$$
$$|E[\phi(\mathcal{X}_{t,N})\,|\,\mathcal{F}_{t-k}]|_1 \le K(1 + \rho^{k}|\mathcal{X}_{t-k,N}|_1); \tag{26}$$

if $j, k > 0$ then

$$|E[\phi(\mathcal{X}_{t,N})\,|\,\mathcal{F}_{t-k}] - E[\phi(\mathcal{X}_{t,N})\,|\,\mathcal{F}_{t-k-j}]|_1 \le K\rho^{k}(|\mathcal{X}_{t-k,N}|_1 + |\mathcal{X}_{t-k-j,N}|_1); \tag{27}$$

and if Assumption 2.1 holds with some $r > 1$ then for $1 \le q \le r$ we have

$$E(|\mathcal{X}_{t,N}|_q^q\,|\,\mathcal{F}_{t-k}) \le K|\mathcal{X}_{t-k,N}|_q^q, \tag{28}$$

where the constant $K$ is independent of $t$, $k$, $j$ and $N$.

The corollary below follows by using (24) and (25). 

Corollary 4.1. Suppose Assumption 2.1 holds. Let $\{X_{t,N}\}$ and $\{X_t(u)\}$ be defined as
in (1) and (5) respectively, and $\phi \in \mathrm{Lip}(1)$. Then $\{\phi(\mathcal{X}_{t,N})\}$ and $\{\phi(\mathcal{X}_t(u))\}$ are
$L^q$-mixingales of size $-\infty$.

4.2. The perturbation technique

In this section we use the perturbation technique, introduced in Aguech et al. [1], to
show consistency of the ANRE estimator. To analyse the algorithm we compare it with a
similar one driven by the true parameters where $X_{t,N}$ has been replaced by the stationary
process $X_t(u)$. Let $\delta_{t,N}(u) = \hat{a}_{t,N} - a(u)$,

$$\mathcal{M}_{t,N} = (Z_t^2 - 1)\sigma_{t,N}^2\frac{\mathcal{X}_{t-1,N}}{|\mathcal{X}_{t-1,N}|_1^2}, \qquad \mathcal{M}_t(u) = (Z_t^2 - 1)\sigma_t(u)^2\frac{\mathcal{X}_{t-1}(u)}{|\mathcal{X}_{t-1}(u)|_1^2}, \tag{29}$$

$$F_{t,N} = \frac{\mathcal{X}_{t,N}\mathcal{X}_{t,N}^T}{|\mathcal{X}_{t,N}|_1^2}, \qquad F_t(u) = \frac{\mathcal{X}_t(u)\mathcal{X}_t(u)^T}{|\mathcal{X}_t(u)|_1^2}, \tag{30}$$

and $F(u)$ be defined as in (9). The 'true algorithm' is

$$a(u) = a(u) + \lambda\{X_t(u)^2 - a(u)^T\mathcal{X}_{t-1}(u)\}\frac{\mathcal{X}_{t-1}(u)}{|\mathcal{X}_{t-1}(u)|_1^2} - \lambda\mathcal{M}_t(u). \tag{31}$$

An advantage of the specific form of the random matrices $F_{t,N}$ is that $|F_{t,N}|_1 \le (p+1)^2$.
This upper bound will make some of the analysis easier to handle.
By subtracting (31) from (2) we obtain

$$\delta_{t,N}(u) = (I - \lambda F_{t-1,N})\delta_{t-1,N}(u) + \lambda\mathcal{B}_{t,N}(u) + \lambda\mathcal{M}_t(u), \tag{32}$$

where

$$\mathcal{B}_{t,N}(u) = \{\mathcal{M}_{t,N} - \mathcal{M}_t(u)\} + F_{t-1,N}\Bigl\{a\Bigl(\frac{t}{N}\Bigr) - a(u)\Bigr\}. \tag{33}$$

We note that (32) can also be considered as a recursive algorithm, thus we call (32) the
error algorithm. There are two terms driving the error algorithm: the bias $\mathcal{B}_{t,N}(u)$ and
the stochastic term $\mathcal{M}_t(u)$. Because the error algorithm is linear with respect to the
estimators we can separate $\delta_{t,N}(u)$ in terms of the bias and the stochastic terms:



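$$\delta_{t,N}(u) = \delta^{B}_{t,N}(u) + \delta^{M}_{t,N}(u) + \delta^{R}_{t,N}(u),$$

where

$$\delta^{B}_{t,N}(u) = (I - \lambda F_{t-1,N})\delta^{B}_{t-1,N}(u) + \lambda\mathcal{B}_{t,N}(u), \qquad \delta^{B}_{p,N}(u) = 0, \tag{34}$$
$$\delta^{M}_{t,N}(u) = (I - \lambda F_{t-1,N})\delta^{M}_{t-1,N}(u) + \lambda\mathcal{M}_t(u), \qquad \delta^{M}_{p,N}(u) = 0, \tag{35}$$
$$\delta^{R}_{t,N}(u) = (I - \lambda F_{t-1,N})\delta^{R}_{t-1,N}(u), \qquad \delta^{R}_{p,N}(u) = -a(u). \tag{36}$$

We have for $y \in \{B, M, R\}$,

$$\delta^{y}_{t,N}(u) = \sum_{k=0}^{t-p-1}\Bigl\{\prod_{i=0}^{k-1}(I - \lambda F_{t-i-1,N})\Bigr\}\mathcal{D}^{y}_{t-k,N}(u), \tag{37}$$

where $\mathcal{D}^{B}_{t,N}(u) = \lambda\mathcal{B}_{t,N}(u)$, $\mathcal{D}^{M}_{t,N}(u) = \lambda\mathcal{M}_t(u)$, $\mathcal{D}^{R}_{t,N}(u) = 0$ if $t > p$ and $\mathcal{D}^{R}_{p,N}(u) = -a(u)$.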

For $\delta^{R}_{t,N}(u)$ to converge, it is clear that the random product $\prod_{i=0}^{k-1}(I - \lambda F_{t-i-1,N})$ must
decay. Technically this is one of the main results. It is stated in the following theorem.

Theorem 4.1. Suppose Assumption 2.1 holds with $r = 4$ and $N$ is sufficiently large.
Then for all $q \ge 1$ and $(p+1) \le t \le N$ there exists $M > 0$, $\delta > 0$ such that

$$\Bigl(E\Bigl\|\prod_{i=0}^{k-1}\Bigl(I - \lambda\frac{\mathcal{X}_{t-i-1,N}\mathcal{X}_{t-i-1,N}^T}{|\mathcal{X}_{t-i-1,N}|_1^2}\Bigr)\Bigr\|^{q}\Bigr)^{1/q} \le M\exp\{-\delta\lambda k\}. \tag{38}$$


Under Assumption 2.1 and by using the proposition above, there exists a $\delta > 0$ with

$$(E|\delta^{R}_{t,N}(u)|^{q})^{1/q} \le M\exp\{-\delta\lambda t\}, \tag{39}$$

for $1 \le q \le r$ and $t = p+1,\ldots,N$. Therefore this term decays exponentially and, as we
shall see below, is of lower order than $\delta^{B}_{t,N}(u)$ and $\delta^{M}_{t,N}(u)$.
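
The decay of the random matrix product can be observed numerically. The following sketch is only an illustration for $p = 1$ with stationary ARCH(1) data; the function names, the Gaussian innovations and the parameter values are assumptions made here. The factors are multiplied in increasing time order, which gives the transpose of the product in the text and therefore the same spectral norm.

```python
import math
import random


def spectral_norm_2x2(m):
    """Largest singular value of a 2x2 matrix via eigenvalues of m^T m."""
    s11 = m[0][0] ** 2 + m[1][0] ** 2
    s22 = m[0][1] ** 2 + m[1][1] ** 2
    s12 = m[0][0] * m[0][1] + m[1][0] * m[1][1]
    half_tr = 0.5 * (s11 + s22)
    disc = max(half_tr ** 2 - (s11 * s22 - s12 ** 2), 0.0)
    return math.sqrt(half_tr + math.sqrt(disc))


def product_norms(x, lam):
    """Spectral norms of the partial products of I - lam * v v^T / |v|_1^2,
    with v = (1, X^2), taken along the observations in x (case p = 1)."""
    prod = [[1.0, 0.0], [0.0, 1.0]]
    norms = []
    for obs in x:
        v = (1.0, obs ** 2)
        n1 = v[0] + v[1]
        a = [[(1.0 if r == c else 0.0) - lam * v[r] * v[c] / n1 ** 2
              for c in range(2)] for r in range(2)]
        prod = [[sum(prod[r][k] * a[k][c] for k in range(2))
                 for c in range(2)] for r in range(2)]
        norms.append(spectral_norm_2x2(prod))
    return norms


def arch1(n, a0, a1, seed=0):
    """Stationary ARCH(1) data with Gaussian innovations (assumed)."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = rng.gauss(0.0, 1.0) * math.sqrt(a0 + a1 * prev ** 2)
        x.append(prev)
    return x


if __name__ == "__main__":
    norms = product_norms(arch1(2000, 1.0, 0.3), lam=0.05)
    for k in (1, 100, 500, 2000):
        print(k, norms[k - 1])
```

Each factor is symmetric with eigenvalues in $[1-\lambda, 1]$, so a single factor never contracts in every direction; the decay of the product comes from the directions of the regressors varying over time, which is the persistence of excitation property.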

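We now study the stochastic bias at $t_0$ with $|t_0/N - u_0| < 1/N$. It is straightforward
to see that $\delta^{B}_{t,N}(u) = \delta^{1,B}_{t,N}(u) + \delta^{2,B}_{t,N}(u) + \delta^{3,B}_{t,N}(u)$, where

$$\begin{aligned}
\delta^{1,B}_{t,N}(u) &= (I - \lambda F_{t-1,N})\delta^{1,B}_{t-1,N}(u) + \lambda\{\mathcal{M}_{t,N} - \mathcal{M}_t(u)\}, \\
\delta^{2,B}_{t,N}(u) &= (I - \lambda F_{t-1,N})\delta^{2,B}_{t-1,N}(u) + \lambda F(u)\Bigl\{a\Bigl(\frac{t}{N}\Bigr) - a(u)\Bigr\}, \\
\delta^{3,B}_{t,N}(u) &= (I - \lambda F_{t-1,N})\delta^{3,B}_{t-1,N}(u) + \lambda(F_{t-1,N} - F(u))\Bigl\{a\Bigl(\frac{t}{N}\Bigr) - a(u)\Bigr\},
\end{aligned} \tag{40}$$

with $\delta^{1,B}_{p,N} = 0$, $\delta^{2,B}_{p,N} = 0$ and $\delta^{3,B}_{p,N} = 0$. In order to obtain asymptotic expressions for the
expectation of each component $\delta^{1,B}_{t,N}$, $\delta^{2,B}_{t,N}$, $\delta^{3,B}_{t,N}$ and $\delta^{M}_{t,N}$ we use the perturbation technique
proposed by Aguech et al. [1]. For $x = M, (1,B), (2,B)$, we can decompose the stochastic
and bias terms as follows:

$$\delta^{x}_{t,N}(u_0) = J^{x,1}_{t,N}(u_0) + J^{x,2}_{t,N}(u_0) + H^{x}_{t,N}(u_0), \tag{41}$$

where

$$\begin{aligned}
J^{x,1}_{t,N}(u_0) &= (I - \lambda F(u_0))J^{x,1}_{t-1,N}(u_0) + G^{x}_{t,N}, \\
J^{x,2}_{t,N}(u_0) &= (I - \lambda F(u_0))J^{x,2}_{t-1,N}(u_0) - \lambda\{F_{t-1,N} - F(u_0)\}J^{x,1}_{t-1,N}(u_0), \\
H^{x}_{t,N}(u_0) &= (I - \lambda F_{t-1,N})H^{x}_{t-1,N}(u_0) - \lambda\{F_{t-1,N} - F(u_0)\}J^{x,2}_{t-1,N}(u_0),
\end{aligned} \tag{42}$$

with, for $t \le t_0$,

$$\begin{aligned}
G^{M}_{t,N} &= \lambda\mathcal{M}_t(u_0), & G^{1,B}_{t,N} &= \lambda[\mathcal{M}_{t,N} - \mathcal{M}_t(u_0)], \\
G^{2,B}_{t,N} &= \lambda F(u_0)\Bigl\{a\Bigl(\frac{t}{N}\Bigr) - a(u_0)\Bigr\}, & G^{3,B}_{t,N} &= \lambda(F_{t-1,N} - F(u_0))\Bigl\{a\Bigl(\frac{t}{N}\Bigr) - a(u_0)\Bigr\}.
\end{aligned} \tag{43}$$

Furthermore, $J^{x,1}_{p,N}(u_0) = J^{x,2}_{p,N}(u_0) = H^{x}_{p,N}(u_0) = 0$. Equation (41) can easily be derived
by taking the sum on both sides of the three equations in (42). In the proposition below
we will show that $J^{x,1}_{t_0,N}(u_0)$ is the principal term in the expansion of $\delta^{x}_{t_0,N}(u_0)$, that is,
with $\delta^{x}_{t_0,N}(u_0) \approx J^{x,1}_{t_0,N}(u_0)$. Substituting (43) into (42) gives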

*o-p— i 

Jt 1 ;N'\u )= J2 Hl-^F(u )} k {M t0 - k . N ~M to - k (u )}, 



k=0 
to— P— 1 



J%$ ) ' 1 (u ) = E {/-AFK)} fe F( Uo ){a 
fe=i 



AT 



^ ) ' 1 («o)= E {I- AF(w )} fc (^t-ft-i,JV --^(wo))|a^^3^ -ffi(«o) j- 

In the proof below we require the following definition 

p-i 

A,AT = $^[Vt-i,tf + Wi-i], fort>p, 
where Vt jv and are defined as in Lemma 4.1. 



a(uo) j, 



(44) 



(45) 



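Proposition 4.2. Suppose Assumption 2.1 holds with $r > 4$, and let $\delta^{B}_{t_0,N}(u_0)$,
$J^{(1,B),1}_{t_0,N}(u_0)$, $J^{(2,B),1}_{t_0,N}(u_0)$, $J^{(3,B),1}_{t_0,N}(u_0)$ be defined as in (34) and (44). Then for
$|t_0/N - u_0| < 1/N$ and $\lambda N \ge (\log N)^{1+\varepsilon}$, for some $\varepsilon > 0$, we have

$$\text{(i)}\quad (E|J^{(1,B),1}_{t_0,N}(u_0) + J^{(2,B),1}_{t_0,N}(u_0)|^{r})^{1/r} = O\Bigl(\frac{1}{(N\lambda)^{\beta}}\Bigr), \tag{46}$$

$$\text{(ii)}\quad (E|J^{(3,B),1}_{t_0,N}(u_0)|^{r})^{1/r} = O\Bigl(\frac{1}{(N\lambda)^{2\beta}} + \frac{\lambda^{1/2}}{(N\lambda)^{\beta}}\Bigr),$$

$$\text{(iii)}\quad (E|\delta^{B}_{t_0,N}(u_0) - J^{(1,B),1}_{t_0,N}(u_0) - J^{(2,B),1}_{t_0,N}(u_0)|^{2})^{1/2} = O\Bigl(\frac{1}{(N\lambda)^{2\beta}} + \frac{\lambda^{1/2}}{(N\lambda)^{\beta}} + \lambda\Bigr). \tag{47}$$

Proof. We first prove (i). Let us consider $J^{(1,B),1}_{t_0,N}(u_0)$. By using (105) we have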



\M t , N -M t (u )U<\Z? + l\i-^ + 



P 



(i, 

t , TV 



N 



"0 



D 



t.N- 



Therefore, by substituting the above bound into J^'n (ito) and by using (99), we have 

(E|^W) 1/r 



E A(l-5A) fe {l + / + ^}(E^ 2 + ir) 1/r (E|A - fc -i,ivr) 1/r - 



to— p— 1 



k=0 



Now by using Lemma 4.1(i) we have that sup t N \D t .N\ r < oo. Furthermore, from Lemma 
C.3 in Moulincs et al. [11] we have 



N 



]T(i - \) k kP < x- 1 - 



By using the above we obtain 



{n4lf'\u )\ r ) l,r < Xsup(E|A, W r) 1/l '(E|Z 2 + ir) 1/r 7^ 



O 



t,N 



(NX)P \(NX)P 



(48) 



(49) 



We now bound (E| J t ( o 2 ^ ) ' 1 (u )| r ) 1/r - Since a(u) E Lip(/3), by using (99) and (48) we have 



So— p— 1 



k=l 



(XN)P 



(50) 



Thus (49) and (50) give the bound (46), which completes the proof of (i). 

To prove (ii), we now bound (E| J t ^' i ^' l ' 1 (wo)| r ) 1 / r . We first observe that jj^'x (^o) 
can be written as 

J to',N ' X ( U o) = h ,N + H to ,N, 
where



to-p-l 



It ,N = 



II 



to, AT 



J2 {I- \F(u )} k {F t ^ hN - F t _fc_i(tio)){o 

k=l 

to—p-X , 

{I- AF(u )} fc (F t _ fe _ 1 (u ) - F(u ))l a 

k=X ^~ 



tp-k 
N 



-a(u ) 



tp-k 
N 



- a(u ) 



Using the same proof as for jj^'jP 11 (uo), it is straightforward to show that (E|2 t0i jv| r ) 
0((7VA)~ 2/3 ). In order to bound II to ,N, let F k (u ) = F k (up)-F(u ) and (j>[x) = xx T /\x\j, 
where x = (l,x±, ... ,x p ); then <f)(x) € Lip(l) and Ft t N = <f>(Xt,N)- Since <f> G Lip(l), by 
using Corollary 4.1 we have that F t (uo) — F(u ) can be written as the sum of martingale 
differences F t (u ) — F(u ) = X)fco m *W' wnere m tW is a (p + 1) x + matrix defined 
by mtM - {E(F tiJV |.Ft-/) - E(F t , w |^ i _,_ 1 )}. By using (27) we have (E|m t (£)| 1 '/ 2 ) 2 /'' < 
JSp^. Substituting the above into B^ B N we have 

oo to— p— 1 



r\l/r 



Ht ,N = J2 £ A{I - \F(u )} k m to -k-x(£)\ a 

£=0 fc=0 



h-k 
N 



- a(u ) 



We note that \\{I - XF{u )} k \\ < K(l - XS) k (see (99)). Furthermore, if h < t 2 then 
E{m tl (l)m t2 (£)} = ¥,{m tl (l)E(m t2 (£)\J 7 tl _ e )} = 0, therefore {m t (£)} t is a sequence of 
martingale differences. Since jj^'^^ N is deterministic we can use Burkholder's inequal- 
ity (cf. Hall and Heyde [6], Theorem 12.2) to obtain 



(E\II t0 , N \ r ) 1/r <K\J2 E 



£=0 

oo 



to-p-l 



J2 {/-AFM} fc nu„-nW{a 



fc=0 
to-p-l 



tp-k 
N 



a{up)j 



r \ 1/r 



i=p 



k=P 



(WA) 



«=0 



A 1 / 2 



(51) 



Thus we have proved (ii). 

We now prove (iii). By using (41) we have 

nw\xB ( \ T (X,B),X, v T (2,B),X f \ir/2x2 



£4 



(i,B),2 
to, AT 



Oo) 



r/2x 2/r 
E



i=l 



r/2x 2/r 



o 



N 13 



exp{— A^o} 



(i,B),2 



We have bounded the first term of the above; to bound the rest we partition J2i=i N 
into four terms: 



Er(i,B),2 _ a b 



rjl,B . r>2,B , r>3,B 



(52) 



z=l 



where 



to-p-l 



fc=0 \i 



(i,B),l 



J -fe-l,JV 



and, for y g {1, 2, 3}, 



5 



t ,N 



to-p-l 



A{J-AF( Uo )} fc [ J Ft - fe -i(Mo)-i ;, K)]jK ) 4, J vK). 



fc=0 



We first bound w . By using (49)-(51) and (107) we have 



K 



By using (109) and (110) we have 



(E\Bi^r /r 



E 



to— p— Ito— p— k— 1 

£ E A 2 {7 - (uo)} fc+l F to __ fc „i(u ) 

fe=0 i=0 



X {M to -k-i-l,N - M to - k -i-i(uo)} 



r/2\ 2/r 



< KX. 

Using a similar proof to the above to bound (E| l^-i jv(' u o)| r ) 1/ ' r we can show 
that \\B^ B N \\f /2 = 0(\ 1/2 (NXy p ). By using (51) it is straightforward to show that 
|| J B 4 3 o '^||f /2 = 0((^A)- 2/3 + A 1 /2(iVA)-' 3 ). Substituting the above bounds for {E\A? ()N \ r / 2 ) 2 / r 



(ElB^r/ 2 ) 2 /-, (E|i3 2 n '^r/ 2 ) 2 /- and (E\B^ B N \^p r into (52), we obtain 



't ,N 



to, AT I 



(n& 2 (u ) + &\u ) + jtTiu )\ r/2 f' r = 0(S N ). 



(53) 



Finally, we prove (iii) by bounding, for $y = 1, 2, 3$, $\|H^{(y,B)}_{t_0,N}(u_0)\|_2$. By using Hölder's
inequality, (53), Theorem 4.1 and that $\{F_{t,N}\}$ are bounded random matrices, we have

(E|<;^K) + < 2 ;^K) + tfgVo)l 2 ) 1/2 



to-p-1 / 

< E A E 

fc=0 \ 



fk-1 



J(I-XF t0 _ iiN ) 



2r/(r-4)\ (r-4)/2r 



x (E|[F 4o _ fc _ 1)iV - F(uo)]{^l^(uo) + J t n-i,Jv(«o) + Jr - k -U}\ 

1 A 1 / 2 , 

+ A 



(NX) 2 / 3 (NX)? 



2/r 



(54) 



Since (E| j£$ ,a (tto)| 2 ) 1/a < (E| J t ^ 2 (uo)r /2 ) 2/, \ by substituting (53) and (54) into 



(52) we have 

(E|*£ w (uo) - J^'Vo) - JS^K)! 2 ) 172 = o 

and we obtain (47). 



1 Va 

(iVA) 2 ^ + (NX)! 3 



+ A 



□

In the following lemma we show that $\delta^{M}_{t_0,N}(u_0)$ is dominated by $J^{M,1}_{t_0,N}(u_0)$, where

$$J^{M,1}_{t_0,N}(u_0) = \sum_{k=0}^{t_0-p-1}\lambda(I - \lambda F(u_0))^{k}\mathcal{M}_{t_0-k}(u_0). \tag{55}$$

We observe that $J^{M,1}_{t_0,N}(u_0)$ and $\mathcal{L}_{t_0}(u_0)$ (defined in (8)) are the same.

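Proposition 4.3. Suppose Assumption 2.1 holds with $r > 4$. Let $\delta^{M}_{t_0,N}(u_0)$ and $J^{M,1}_{t_0,N}(u_0)$
be defined as in (35) and (55). Then for $|t_0/N - u_0| < 1/N$ and $\lambda N \ge (\log N)^{1+\varepsilon}$, for
some $\varepsilon > 0$, we have

$$(E|J^{M,1}_{t_0,N}(u_0)|^{r})^{1/r} = O(\sqrt{\lambda}), \tag{56}$$

$$(E|\delta^{M}_{t_0,N}(u_0) - J^{M,1}_{t_0,N}(u_0)|^{2})^{1/2} = O\Bigl(\lambda + \frac{\lambda^{1/2}}{(N\lambda)^{\beta}}\Bigr). \tag{57}$$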


Proof. Since each component of the vector sequence $\{\mathcal{M}_t(u_0)\}$ is a martingale difference,
we can use the Burkholder inequality componentwise and (99) to obtain (56). Since
$\delta^{M}_{t_0,N}(u_0) - J^{M,1}_{t_0,N}(u_0) = J^{M,2}_{t_0,N}(u_0) + H^{M}_{t_0,N}(u_0)$, to prove (57) we bound $J^{M,2}_{t_0,N}(u_0)$ and
$H^{M}_{t_0,N}(u_0)$. By using the same arguments given in Proposition 4.2 we also have

$$(E|J^{M,2}_{t_0,N}(u_0)|^{r/2})^{2/r} \le K\Bigl(\lambda + \frac{\lambda^{1/2}}{(N\lambda)^{\beta}}\Bigr).$$

Finally, we obtain a bound for $H^{M}_{t_0,N}(u_0)$. By using Hölder's inequality, Theorem 4.1 and

$(E\|F_{t_0-k-1,N} - F(u_0)\|^{r/2})^{2/r} \le 2(p+1)^2$ we have

« w wir<A-(A + Aj. 

Thus we have shown (57). □ 

By using Propositions 4.2 and 4.3 we have established that $J^{(1,B),1}_{t_0,N}(u_0)$, $J^{(2,B),1}_{t_0,N}(u_0)$
and $J^{M,1}_{t_0,N}(u_0)$ are the principal terms in $\hat{a}_{t_0,N} - a(u_0)$. More precisely, we have shown
that

$$\hat{a}_{t_0,N} - a(u_0) = J^{(1,B),1}_{t_0,N}(u_0) + J^{(2,B),1}_{t_0,N}(u_0) + J^{M,1}_{t_0,N}(u_0) + R^{(1)}_N, \tag{58}$$

where $(E|R^{(1)}_N|^2)^{1/2} = O(\delta_N)$ ($\delta_N$ is defined in (7)). By using (58), we show below that
an upper bound for the mean squared error can immediately be obtained.

Proof of Theorem 2.1. By substituting the bounds for $(E|J^{(1,B),1}_{t_0,N}(u_0) + J^{(2,B),1}_{t_0,N}(u_0)|^2)^{1/2}$
and $(E|J^{M,1}_{t_0,N}(u_0)|^2)^{1/2}$ (in Propositions 4.2 and 4.3) into (58) we have
$E(|\hat{a}_{t_0,N} - a(u_0)|^2) = O((\lambda N)^{-\beta} + \lambda^{1/2} + \delta_N)^2$. Thus we have (10). □

4.3. Proof of Theorem 2.2

We can see from Propositions 4.2 and 4.3 that $J^{(1,B),1}_{t_0,N}(u_0) + J^{(2,B),1}_{t_0,N}(u_0)$ and $J^{M,1}_{t_0,N}(u_0)$
are leading terms in the expansion of $\delta_{t_0,N}$. For this reason, to prove Theorem 2.2 we
need only consider these terms. Now considering the bias, we observe that

$$J^{(1,B),1}_{t_0,N}(u_0) + J^{(2,B),1}_{t_0,N}(u_0) = \mathcal{R}_{t_0,N}(u_0) + R^{(2)}_N, \tag{59}$$

where we have replaced $\mathcal{M}_{t,N}$ by $\mathcal{M}_t(t/N)$, leading to the remainder

$$R^{(2)}_N = \sum_{k=0}^{t_0-p-1}\lambda\{I - \lambda F(u_0)\}^{k}\Bigl\{\mathcal{M}_{t_0-k,N} - \mathcal{M}_{t_0-k}\Bigl(\frac{t_0-k}{N}\Bigr)\Bigr\}.$$

By using (21) we have $(E|R^{(2)}_N|^{r})^{1/r} \le K/N^{\beta}$, for all $t, N$. Therefore under Assumption
2.1 with $r > 2$, if $\lambda N \ge (\log N)^{1+\varepsilon}$ for some $\varepsilon > 0$, we have, by substituting (59) into (58),

$$\hat{a}_{t_0,N} - a(u_0) = \mathcal{L}_{t_0}(u_0) + \mathcal{R}_{t_0,N}(u_0) + R_N, \tag{60}$$

where $(E|R_N|^2)^{1/2} = O(\delta_N)$. In the proposition below we show the asymptotic normality
of $\mathcal{L}_{t_0}(u_0)$, which we use together with (60) to obtain the asymptotic normality of $\hat{a}_{t_0,N}$.






Proposition 4.4. Suppose Assumption 2.1 holds with $r = 1$ and, for some $\delta > 0$, $E(Z_0^{4+\delta}) < \infty$. Let $\mathcal L_{t_0}(u_0)$ and $\Sigma(u)$ be defined as in (8) and (13), respectively. If $|t_0/N - u_0| < 1/N$, then we have

$$\lambda^{-1/2}\mathcal L_{t_0}(u_0) \xrightarrow{D} \mathcal N(0, \Sigma(u_0)), \qquad (61)$$

where $\lambda \to 0$ as $N \to \infty$ and $\lambda N \gg (\log N)^{1+\varepsilon}$, for some $\varepsilon > 0$.

Proof. Since $\mathcal L_{t_0}(u_0)$ is a weighted sum of martingale differences, the result follows from the martingale central limit theorem and the Cramér-Wold device. It is straightforward to check the conditional Lindeberg condition, so we omit the details. We simply note that by using (100) we obtain the limit of the conditional variance of $\mathcal L_{t_0}(u_0)$:

M4 > A \1 — \r [Uq)\ Ot a -k{u Q ) —, — rn >H u o)- (»2) 

|Af to _ fe -i(uo)|i 

□ 

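Proof of Theorem 2.2. It follows from (60) that

$$\lambda^{-1/2}\{\hat a_{t_0,N} - a(u_0)\} = \lambda^{-1/2}\mathcal R_{t_0,N}(u_0) + \lambda^{-1/2}\mathcal L_{t_0}(u_0) + O_p(\lambda^{-1/2}\delta_N).$$

Therefore, if $\lambda \gg N^{-4\beta/(4\beta+1)}$ and $\lambda \gg N^{-2\beta}$, then $\lambda^{-1/2}/(N\lambda)^{2\beta} \to 0$ and $\lambda^{-1/2}/N^{\beta} \to 0$, respectively, and we have

$$\lambda^{-1/2}\{\hat a_{t_0,N} - a(u_0)\} = \lambda^{-1/2}\mathcal R_{t_0,N}(u_0) + \lambda^{-1/2}\mathcal L_{t_0}(u_0) + o_p(1).$$

By using the above and Proposition 4.4 we obtain (11). Finally, since $|\mathcal R_{t_0,N}(u_0)|_1 = O_p((N\lambda)^{-\beta})$, if $\lambda \gg N^{-2\beta/(2\beta+1)}$ then $\lambda^{-1/2}\mathcal R_{t_0,N}(u_0) \xrightarrow{P} 0$ and we have (12). □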
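The step-size conditions used in this proof reduce to simple exponent arithmetic when $\lambda = N^{-a}$. For instance, $\lambda^{-1/2}/(N\lambda)^{2\beta} \to 0$ holds exactly when $a < 4\beta/(4\beta+1)$, and $\lambda^{-1/2}/(N\lambda)^{\beta} \to 0$ when $a < 2\beta/(2\beta+1)$. The following sketch tabulates these thresholds; the parametrization $\lambda = N^{-a}$ and the names `step_size_thresholds` and `admissible` are illustrative assumptions.

```python
from fractions import Fraction

def step_size_thresholds(beta):
    """Exponents c such that the proof needs lambda >> N**(-c)."""
    beta = Fraction(beta)
    return {
        # lambda^{-1/2} / (N lambda)^{2 beta} -> 0
        "second_order": 4 * beta / (4 * beta + 1),
        # lambda^{-1/2} / N^beta -> 0
        "remainder": 2 * beta,
        # lambda^{-1/2} / (N lambda)^beta -> 0 (bias term negligible)
        "bias": 2 * beta / (2 * beta + 1),
    }

def admissible(a, beta, bias_negligible=False):
    """Check whether lambda = N**(-a) meets the exponent conditions."""
    a = Fraction(a)
    th = step_size_thresholds(beta)
    ok = 0 < a < min(th["second_order"], th["remainder"])
    if bias_negligible:
        ok = ok and a < th["bias"]
    return ok
```

For $\beta = 1$ the thresholds are $4/5$, $2$ and $2/3$, so a choice such as $\lambda = N^{-3/4}$ satisfies the first two conditions but leaves a non-negligible bias term.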

5. An asymptotic expression for the bias under 
additional regularity 

Under additional assumptions on the smoothness of the parameters a(u), we obtain in 
this section an exact expression for the bias, which we then use to prove Theorem 2.3. 

In the section above we showed that 6? N f=a r R.t ,N{uo)- Since A4t(u) is a function on 
X t (u) 2 whose derivative exists, the derivative of A4t(u) also exists and is given by 

$$\dot{\mathcal M}_t(u) = (Z_t^2 - 1)\{\dot F_{t-1}(u)a(u) + F_{t-1}(u)\dot a(u)\}, \qquad (63)$$

where

$$\dot F_{t-1}(u) = \dot Y_{t-1}(u)Y_{t-1}(u)^T + Y_{t-1}(u)\dot Y_{t-1}(u)^T. \qquad (64)$$
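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It is interesting to note that, like $\{\mathcal M_t(u)\}_t$, $\{\dot{\mathcal M}_t(u)\}_t$ is a sequence of vector martingale differences. We will use $\dot{\mathcal M}_t(u)$ to refine the approximation $\mathcal R_{t_0,N}(u_0)$ and show $\mathcal R_{t_0,N}(u_0) \approx \tilde{\mathcal R}_{t_0,N}(u_0)$, where

$$\tilde{\mathcal R}_{t_0,N}(u_0) = \sum_{k=0}^{t_0-p-1} \lambda\{I - \lambda F(u_0)\}^k \Big(\frac{t_0-k}{N} - u_0\Big)\{\dot{\mathcal M}_{t_0-k}(u_0) + F(u_0)\dot a(u_0)\}. \qquad (65)$$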

We use this to obtain Theorem 2.3. 

Lemma 5.1. Suppose Assumption 2.2 holds with some $r > 2$. Then we have

$$|\dot{\mathcal M}_t(u) - \dot{\mathcal M}_t(v)|_1 \le K|u - v|^{\beta'}|Z_t^2 + 1|L_t, \qquad (66)$$

where $L_t = \{1 + \sum_{k=1}^{p} Z_{t-k}^2\}^2$, with $(E(L_t)^{r/2})^{2/r} < \infty$, and almost surely

$$\mathcal M_t(v) = \mathcal M_t(u) + (v - u)\dot{\mathcal M}_t(u) + (u - v)^{1+\beta'}R_t(u,v), \qquad (67)$$

where $|R_t(u,v)| \le L_t$.

Proof. By using (63), a(u) £ Lip(l + /?') and the fact that |F t _i(u)|i < (p+ l) 2 we have 
\M t (u) - Mtiv)^ 

+ |jl_i(«)-Ji_i(t;)|i+sup|Ji_i(«)|i|tt-t;|}. (68) 

In order to bound (68), we consider $F_t(u)$ and its derivatives. We see that $|F_{t-1}(u) - F_{t-1}(v)|_1 \le K|u - v|L_t$, thus bounding the first term on the right-hand side of (68). To obtain the other bounds we use (64). Now by using (22) and Lemma 4.2 we have that $\sup_u |\dot F_{t-1}(u)|_1 \le KL_t$ and $|\dot F_{t-1}(u) - \dot F_{t-1}(v)|_1 \le KL_t|u - v|^{\beta'}$. Altogether this verifies (66). By using the Cauchy-Schwarz inequality we have that $(E(L_t)^{r/2})^{2/r} < \infty$.

Finally, we prove (67). Since $L_t$ is a well-defined random variable, almost surely all the paths of $\mathcal M_t(u)$ belong to $\mathrm{Lip}(1 + \beta')$. Therefore, there exists a set $\mathcal N$ such that $P(\mathcal N) = 0$, and for every $\omega \in \mathcal N^c$ we can make a Taylor expansion of $\mathcal M_t(u, \omega)$ about $u$ and obtain (67). □
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Lemma 5.2. Suppose Assumption 2.2 holds with r > 2. Then if \to/N — Uo\ < 1/N we 
have 



(69) 



where E(|fl^ ) | r ) 1 / r = 0(1/(N\) 1+ P'). 



Proof. To obtain the result we find the Taylor expansions of $a((t-k)/N)$ and $\mathcal M_{t-k}((t-k)/N)$ about $u_0$, and substitute these into $\mathcal R_{t_0,N}$. Using Lemma 5.1, we obtain the desired result. □

In the next lemma we consider the mean and variance of the bias and stochastic terms 
^t ,Jv("o) an d £t ( u o), which we use to obtain an asymptotic expression for the bias. 
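We will use the following results. Since $\inf_u \lambda_{\min}(F(u)) > C > 0$ (see (97)), we have

$$\sum_{k=0}^{t-1} \lambda\{I - \lambda F(u)\}^k F(u) \to I, \qquad (70)$$

$$\sum_{k=0}^{t-1} \lambda^2\{I - \lambda F(u)\}^{2k}\Big(\frac{k}{N}\Big)^2 = O\Big(\frac{1}{N^2\lambda}\Big), \qquad (71)$$

if $\lambda \to 0$, $t\lambda \to \infty$ as $t \to \infty$.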
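The limit in (70) follows from the telescoping identity $\sum_{k=0}^{t-1}\lambda\{I - \lambda F\}^k F = I - \{I - \lambda F\}^t$, and the right-hand side tends to $I$ once $t\lambda \to \infty$ and $\lambda_{\min}(F) > 0$. A small numerical sketch of this identity is given below; the matrix `F` used to exercise it is an arbitrary symmetric positive definite example and `weighted_sum_70` is a hypothetical helper name.

```python
import numpy as np

def weighted_sum_70(F, lam, t):
    """Compute sum_{k=0}^{t-1} lam * (I - lam F)^k F by direct summation."""
    d = F.shape[0]
    G = np.eye(d) - lam * F
    power = np.eye(d)            # holds (I - lam F)^k
    total = np.zeros((d, d))
    for _ in range(t):
        total += (lam * power) @ F
        power = power @ G
    return total
```

With a small step size and $t\lambda$ large, the returned matrix is numerically close to the identity, and for any $t$ it agrees with $I - (I - \lambda F)^t$ up to rounding.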



Lemma 5.3. Suppose Assumption 2.2 holds with r > 4. Let lZt 0t N(ua), J~-t ( u o) and 
6e defined as in (8) and (13), respectively. Then if\to/N — uq\ < 1/N we have 



l(K tQ , N (u )) = --l-FK)- 1 ^) + O 



w(7Jf 0) jv(Mo)) = O 



1 



1 



k iV 2 A TV, 

E(£ to (itg)) = and var(£ to (m )) = A£(uq) + o(A). 



(72) 

(73) 
(74) 



Proof. Since $\{d\mathcal M_k(u)/du\}$ are martingale differences, by applying (69) to (70) we have (72).

We now show (73). By using (71), $\sup_u |da(u)/du| < \infty$ and $\sup_u |F_t(u)| \le (p+1)^2$, we have

va,r{Tlt ,N(uo)} 

= ^°gf A 2 {/-AFM} 2fe (^V 
k=o ^ ' 

x vax{F to ^ k -i(u )a(uo) + Ft _fc_i(u )a(uo)} + °(^2 ) 
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o- 1 1 



7V 2 A iV 2 



It is straightforward to show (74) using (100).

Finally, since $\mathcal L_{t_0}(u_0)$ is a sum of martingale differences, $E(\mathcal L_{t_0}(u_0)) = 0$. □

From the above lemma it immediately follows that 

K t0 , N M = -j^F(uo)~ 1 a(u ) + P { (JVA j 1+ ^ + (75) 

and $(N\lambda)\mathcal R_{t_0,N}(u_0) \xrightarrow{P} -F(u_0)^{-1}\dot a(u_0)$. We now use this to prove Theorem 2.3.

Proof of Theorem 2.3. By substituting (75) into (60) (with $\beta = 1$) we have

&t ,N - a( u o) = ^^M^aM + C to (u ) + 5' N R^\ (76) 
where (E|i?^ | 2 ) 1/2 < oo and 

1 1 1 x 1 

S' N = 7T77TTT7F7 + TTTTTo" + TT7TTT + A + ■ 



(N\y+F (NX) 2 X 1 / 2 N N 1 / 2 ' 
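By using the above and $\mathrm{var}(\mathcal L_{t_0}(u_0)) = \lambda\Sigma(u_0)$ we have

$$E|\hat a_{t_0,N} - a(u_0)|^2 = \lambda\,\mathrm{tr}\{\Sigma(u_0)\} + \frac{1}{(N\lambda)^2}|F(u_0)^{-1}\dot a(u_0)|^2 + O\big([(N\lambda)^{-1} + \sqrt{\lambda} + \delta'_N]\delta'_N\big),$$

which gives (15). In order to prove (16) we use (76) and Proposition 4.4. We first note that if $\lambda^{-1/2}/(N\lambda)^{1+\beta'} \to 0$, then by using (76) we have

$$\lambda^{-1/2}\{\hat a_{t_0,N} - a(u_0)\} + \lambda^{-1/2}\frac{1}{N\lambda}F(u_0)^{-1}\dot a(u_0) = \lambda^{-1/2}\mathcal L_{t_0}(u_0) + o_p(1).$$

Therefore by using Proposition 4.4 we have (16). □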
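The two leading terms of the mean squared error expansion above, $\lambda\,\mathrm{tr}\{\Sigma(u_0)\}$ (variance) and $(N\lambda)^{-2}|F(u_0)^{-1}\dot a(u_0)|^2$ (squared bias), pull in opposite directions as $\lambda$ varies. Ignoring the remainder, setting the derivative in $\lambda$ to zero gives $\lambda^* = \{2|F(u_0)^{-1}\dot a(u_0)|^2/(\mathrm{tr}\{\Sigma(u_0)\}N^2)\}^{1/3}$, which is of order $N^{-2/3}$. The sketch below encodes only this calculation; the inputs `trace_sigma` and `bias_norm_sq` stand for the two unknown constants and are placeholders, not quantities estimated from data.

```python
import numpy as np

def leading_mse(lam, N, trace_sigma, bias_norm_sq):
    """Leading terms of the MSE: variance part plus squared-bias part."""
    return lam * trace_sigma + bias_norm_sq / (N * lam) ** 2

def optimal_step_size(N, trace_sigma, bias_norm_sq):
    """Minimizer of leading_mse in lam (remainder terms are ignored)."""
    return (2.0 * bias_norm_sq / (trace_sigma * N ** 2)) ** (1.0 / 3.0)
```

Since the minimizer scales like $N^{-2/3}$, multiplying the sample size by 8 divides the suggested step size by 4.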
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Appendix 

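A.1. Mixingale properties of $\phi(\mathcal X_{t,N})$

Our object in this section is to prove Proposition 4.1. We do this by using the random vector representation of the tvARCH process $\{\mathcal X_{t,N}\}_t$. Let

$$A_t(u) = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & a_1(u)Z_t^2 & a_2(u)Z_t^2 & \cdots & a_{p-1}(u)Z_t^2 & a_p(u)Z_t^2 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{pmatrix} \qquad (77)$$

and $b_t(u)^T = (1, a_0(u)Z_t^2, 0_{p-1}^T)$. By using the definition of the tvARCH process given in (1) we have that the tvARCH vectors $\{\mathcal X_{t,N}\}_t$ satisfy the representation

$$\mathcal X_{t,N} = A_t\Big(\frac{t}{N}\Big)\mathcal X_{t-1,N} + b_t\Big(\frac{t}{N}\Big). \qquad (78)$$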



Equation (78) looks like a vector autoregressive process; the difference is that $A_t(t/N)$ is a random matrix. However, similar to the vector autoregressive case, it can be shown that the product $\prod_{k=0}^{t} A_k(k/N)$ decays exponentially. It is this property which we use to prove Proposition 4.1.



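Let $A_N(t,j) = \{\prod_{i=0}^{j-1} A_{t-i}((t-i)/N)\}$, $A(u,t,j) = \{\prod_{i=0}^{j-1} A_{t-i}(u)\}$, $A_N(t,0) = I_{p+1}$ and $A(u,t,0) = I_{p+1}$. By expanding (78) we have

$$\mathcal X_{t,N} = A_N(t,k)\mathcal X_{t-k,N} + \sum_{j=0}^{k-1} A_N(t,j)\,b_{t-j}\Big(\frac{t-j}{N}\Big). \qquad (79)$$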
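The vector recursion (78) and its expansion (79) can be checked numerically against the scalar tvARCH recursion. The sketch below does this for a tvARCH(2) process under the assumption that $\mathcal X_{t,N} = (1, X_{t,N}^2, X_{t-1,N}^2)^T$; the parameter curves in `curves`, the Gaussian innovations and the zero pre-sample values are illustrative choices and do not come from the paper.

```python
import numpy as np

P = 2  # order of the illustrative tvARCH process

def curves(u):
    """Hypothetical parameter curves (a_0(u), a_1(u), a_2(u))."""
    return np.array([1.0 + 0.5 * u, 0.2 + 0.1 * np.sin(2 * np.pi * u), 0.1])

def A_matrix(a, z2):
    """Random matrix A_t(u): ARCH row scaled by Z_t^2, then a shift block."""
    A = np.zeros((P + 1, P + 1))
    A[1, 1:] = a[1:] * z2
    for i in range(2, P + 1):
        A[i, i - 1] = 1.0
    return A

def b_vector(a, z2):
    """Vector b_t(u) = (1, a_0(u) Z_t^2, 0, ..., 0)."""
    b = np.zeros(P + 1)
    b[0] = 1.0
    b[1] = a[0] * z2
    return b

def simulate(N, seed=0):
    """Simulate X_{t,N}^2 directly and through the vector recursion."""
    rng = np.random.default_rng(seed)
    z2 = rng.standard_normal(N + 1) ** 2
    x2 = np.zeros(N + 1)                 # x2[t] = X_{t,N}^2, zero start
    X = np.zeros((N + 1, P + 1))
    X[0, 0] = 1.0                        # X_0 = (1, 0, 0)
    for t in range(1, N + 1):
        a = curves(t / N)
        lags = np.array([x2[t - j] if t - j >= 0 else 0.0
                         for j in range(1, P + 1)])
        x2[t] = z2[t] * (a[0] + a[1:] @ lags)
        X[t] = A_matrix(a, z2[t]) @ X[t - 1] + b_vector(a, z2[t])
    return x2, X, z2

def expansion(X, z2, t, k, N):
    """Right-hand side of the k-step expansion of the vector recursion."""
    prod = np.eye(P + 1)                 # A_N(t, 0)
    acc = np.zeros(P + 1)
    for j in range(k):
        a = curves((t - j) / N)
        acc += prod @ b_vector(a, z2[t - j])
        prod = prod @ A_matrix(a, z2[t - j])   # becomes A_N(t, j + 1)
    return prod @ X[t - k] + acc
```

The second component of the simulated vectors reproduces the scalar series, and `expansion` returns the same vector as the recursion for any admissible `k`.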



Suppose $A$ is an $n \times m$ dimensional random matrix, with $(i,j)$th element $A_{ij}$, and define the $n \times m$ dimensional matrix $[A]_q$ where $[A]_q = \{(E|A_{ij}|^q)^{1/q}: i = 1,\dots,n,\ j = 1,\dots,m\}$.

Now by using Proposition 2.1 in Subba Rao [14] and Corollary A.2 below, we have

$$\|[A_N(t,k)]_q\| \le K\rho^k \quad \text{and} \quad \|[A(u,t,k)]_q\| \le K\rho^k. \qquad (80)$$



Proof of Proposition 4.1. We first prove (24). Let $C_N(t,k) := \sum_{j=0}^{k-1} A_N(t,j)\,b_{t-j}((t-j)/N)$, that is,

$$\mathcal X_{t,N} = A_N(t,k)\mathcal X_{t-k,N} + C_N(t,k), \qquad (81)$$

with $A_N(t,k), C_N(t,k) \in \sigma(Z_t, \dots, Z_{t-k+1})$ and $\mathcal X_{t-k,N} \in \mathcal F_{t-k}$. In particular, we have

$$E\{\phi(C_N(t,k)) \mid \mathcal F_{t-k}\} = E\{\phi(C_N(t,k))\}. \qquad (82)$$
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Furthermore, by using Minkowski's inequality it can be shown that 

$$\{E(|A_N(t,k)\mathcal X_{t-k,N}|^q \mid \mathcal F_{t-k})\}^{1/q} \le K|[A_N(t,k)]_q\,\mathcal X_{t-k,N}|_1.$$

The Lipschitz continuity of $\phi$ and (79) now imply $|\phi(\mathcal X_{t,N}) - \phi(C_N(t,k))| \le K|A_N(t,k)\mathcal X_{t-k,N}|$. Therefore, by using the above we obtain

$$\begin{aligned}
|E\{\phi(\mathcal X_{t,N}) \mid \mathcal F_{t-k}\} - E\{\phi(\mathcal X_{t,N})\}|_1
&= |E\{\phi(\mathcal X_{t,N}) - \phi(C_N(t,k)) \mid \mathcal F_{t-k}\} - E\{\phi(\mathcal X_{t,N}) - \phi(C_N(t,k))\}|_1 \\
&\le K\big(|[A_N(t,k)]_1\mathcal X_{t-k,N}|_1 + E\{|[A_N(t,k)]_1\mathcal X_{t-k,N}|_1\}\big) \\
&\le K\{\|[A_N(t,k)]_1\|\,|\mathcal X_{t-k,N}|_1 + E(\|[A_N(t,k)]_1\|\,|\mathcal X_{t-k,N}|_1)\} \\
&\le K\rho^k(1 + |\mathcal X_{t-k,N}|_1) \qquad (83)
\end{aligned}$$

since $E|\mathcal X_{t-k,N}|_1$ is uniformly bounded, thus giving (24). The proof of (25) is the same as the proof above, so we omit the details. Inequality (26) follows from (24) with the triangle inequality.

To prove (27) we use (81). Since 

E{cf>(C N (t,k))\^ k } = E{ct > (C N (t,k))} =E{^(C N (t,k))\T t ^}, 

we obtain, as in (83), 

\E{<t>(X t ,N)\Ft-k}-'&{(KXt,N)\7 r t-k-j}\ 1 

<Kp h {\X t _ ktN \ l + \X t _ ktN \ 1 ), 
which gives (27). 

To prove (28) we will use (83). We first note that by using Minkowski's inequality and 
the equivalence of norms, there exists a constant K independent of X t .N such that 

{EdA-^Ar^l^-^fe)} 1 /^ < {Ed^ljvCt, fc)^_ fe!jv |<?|^- i _ & )} 1 /'? + {M(|C w (t, fc)^!^-^^)} 1 /-?. 

Now by using {E(\A N (t, AO^t-fcJvN-^-fc)} 179 < Kp k \X t - k , N \i and 

fc-i 

[E\C N {t,k)\ q \Ft-k] 1/q <KY, 

3=0 

we have 

f K } q 

E{|* t jvH^-fc} < iKp^Xt^U + — p ) ^ K\X t _ h , N \\, 

hence we obtain (28). □ 
We use the corollary below to prove Lemma A.7 at the end of Section A.3.



[A N (t,j)] q 



-t-j 



t-j 
N 



< 



K 
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Corollary A.l. Suppose Assumption 2.1 holds with r = qo and E(Z^ qa ) < oo. Let 
f,g:W +1 — > M(p+ 1 ) x (p+ 1 ) be Lipschitz continuous functions, such that for all positive 
xeW +1 , | /(x) |abs < Ip+i and \g(x)\ ahs <I p+1 , where l p+1 is a (p + 1) X (p + 1) matrix 
with one in all the elements, and |A| a b s = flA^I: i,j = l,...,p+ 1). Then for q<qo we 
have 

[mi(Zl k+1 - l)f{X t ^ N )g{X t {u))\T t ^ 3 } 

E{(Zl k+1 - l)f(X t ^ N )g(X t (u))}) q ] 1/q 
< K{E(Zt) + lymXt-k-j, N\ q ) 1/q + (ElXt^W) 1 ^) (84) 

and 

[E(E{(Zl k+1 - l)f{X t _ k {u))g{X t {u))\T t - k - 3 } 

-E{(Zl k+1 -l)f(X t _ k (u))g(X t (u))}y] 1/q 
< K{E(Z±) + l}f?({E\Xt-k- 3 iu)\< i y>< 1 + (ElA^k-iMI 9 ) 1 '*), (85) 
for j, k>0, where p is such that 1 — n < p < 1 . 

Proof. We give the proof of (84) only; that of (85) is the same. We use the notation introduced in the proof of Proposition 4.1 and let $C(u,t,k) = \sum_{j=0}^{k-1} A(u,t,j)\,b_{t-j}(u)$, that is, $\mathcal X_t(u) = A(u,t,k+j)\mathcal X_{t-k-j}(u) + C(u,t,k+j)$.

We use (82). Since $|f(x)|_{\mathrm{abs}} \le I_{p+1}$ and $|g(x)|_{\mathrm{abs}} \le I_{p+1}$ we have

\(Z?_ k+1 -l)(f(X t _ k , N )g(X t (u))-f(C N (t-k,3))g(C(u,t,k + m 

< \Z 2 t _ k+1 + l\{\A N {t - fc,j)-**-fc-j,JsrIi + |4(u,t,fc+ j)*t-fc-i(u)|i). (86) 

Since A(u,t, k — 1), {Zf_ k + l)A t - k -i(u) and A(u,t — k,j) are independent matrices and 
by using (80), we have 

E\\[{Z 2 t _ k+ i + l)A{u,t,k + 3)]i\\ 

< 3E|| *, fc — l)]i|||| [(^_ fc+ i -I- *-HiC«)]ilUII[-4-C«»* — 

<iO£(Z t Vi + 1 )/ + ^ 1 - (87) 

Considering the conditional expectation of (86), by using (87) and $|E_{t-k-j}\{(Z_{t-k+1}^2 - 1)A_N(t-k,j)\mathcal X_{t-k-j,N}\}| \le K\rho^j|\mathcal X_{t-k-j,N}|_1$, we have

|E[(Z 4 2 _ fc+1 -l){/(A' t _ fc , w ) 5 (A' t ( W ))-/(C A r(t-fc,.7)) 3 (C(u,t,fc + j))}|^_ fe _ 1 ]| 

< K(P\X t _ h _^ N \ x E{Zl_ k+x + l)+KE(Zf_ k+1 + ^p^-Vt-fe-iWIi 

< A'E(Z 4 + l)^'(|Ai_ fc _ AAr |i + |Ai_ fc _j(«)|i). 
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Using similar arguments, we obtain the bound 

(E\E[(Zl k+1 - l){f(X t _ k , N )g(X t (u)) - f(C N (t-k,j))g(C(u,t,k + j))}]\ r ) 1/r 
< KE(Z* + l)p*(E(|A4_ fc _ i , jv| r ) 1/r + (nXt-k-j(u)\ r ) 1/r ): 
leading to the result. □ 



A.2. Persistence of excitation

As we mentioned in Section 1, a crucial component in the analysis of many stochastic, recursive algorithms is to establish persistence of excitation of the transition random matrix in the algorithm: in our case this implies showing Theorem 4.1. Intuitively, it is clear that the verification of this result depends on showing that the smallest eigenvalue of the conditional expectation of the positive semi-definite matrix $\mathcal X_{t,N}\mathcal X_{t,N}^T/|\mathcal X_{t,N}|_1^2$ is bounded away from zero. In particular, this is one of the major conditions given in Moulines et al. ([11], Theorem 16), where conditions were given under which persistence of excitation can be established. In fact, in this section we verify the conditions in Moulines et al. ([11], Theorem 16) to prove Theorem 4.1.

Suppose $X$ is a random variable and define $E_t(X) = E(X \mid \mathcal F_t)$.



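Lemma A.1. Suppose that Assumption 2.1 holds with $r = 2$. Then we have

$$\lambda_{\min}\big(E\{\mathcal X_t(u)\mathcal X_t(u)^T \mid \mathcal F_{t-k}\}\big) > C \qquad (88)$$

and, for $N$ large enough,

$$\lambda_{\min}\big(E\{\mathcal X_{t,N}\mathcal X_{t,N}^T \mid \mathcal F_{t-k}\}\big) > \frac{C}{2}, \qquad (89)$$

for $k \ge (p+1)$, where $C$ is a finite constant independent of $t$, $N$ and $u$.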

Proof. We will prove (89); the proof of (88) is similar. We partition X t ^Xj N as 

X t , N X t i ; N = A 1 +A 2 , (90) 



where Ai and A2 are positive definite matrices, Ai € o~(Z t , ■ ■ ■ , Z t ~ p ) and A2 €E Tf This 
implies A m i n {E f _fc(A' ti Ar< ; f t T ;v )} > A min {E(Ai)}, if k >P + 1, which allows us to obtain a 
uniform lower bound for \ m i n {Et-k(Xt,NX^ N )}. To facilitate this we represent Xt.N in 
terms of martingale vectors. By using (1) we have 



Y 2 

A t,7V 



: a 



X 



t-i,N 



(z? 1K 2 



N 
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and 



N 



j Xt-i,N + {Zt - l)af N D, 



(91) 



where $D$ is a $(p+1)$-dimensional vector with $D_i = 0$ if $i \ne 2$ and $D_2 = 1$, and



/ 1 

oo(u) oi(w) a 2 (u) 



e(«) 



00 ^ 

. .. Op_i(«) a p (u) 



V 

With p + 1 iterations of (91) 

p 



i=0 



0=0 



0=0 






J 



t-j 



N 



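Let $\mu_4 = E(Z_t^2 - 1)^2$, $\Theta_{t,N}(i) = \prod_{j=0}^{i-1}\Theta((t-j)/N)$ and $B_{t,N}(i) = \Theta_{t,N}(i)DD^T\Theta_{t,N}(i)^T$. Since $\{(Z_t^2 - 1)\sigma_{t,N}^2\}_t$ are martingale differences, we have

$$E_{t-k}(\mathcal X_{t,N}\mathcal X_{t,N}^T) = \mu_4\sum_{i=0}^{p} E_{t-k}(\sigma_{t-i,N}^4)\,\Theta_{t,N}(i)DD^T\Theta_{t,N}(i)^T + \Theta_{t,N}(p+1)\,E_{t-k}(\mathcal X_{t-p-1,N}\mathcal X_{t-p-1,N}^T)\,\Theta_{t,N}(p+1)^T \quad \text{for } k \ge p+1. \qquad (92)$$

Since the matrices above are non-negative definite, we have that

$$\lambda_{\min}\{E(\mathcal X_{t,N}\mathcal X_{t,N}^T \mid \mathcal F_{t-k})\} \ge \lambda_{\min}\Big\{\mu_4\sum_{i=0}^{p} E_{t-k}(\sigma_{t-i,N}^4)B_{t,N}(i)\Big\}. \qquad (93)$$

We now refine $\mu_4\sum_{i=0}^{p} E_{t-k}(\sigma_{t-i,N}^4)B_{t,N}(i)$ to obtain $A_1$. By using (79) we have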



3=0 



\ — ^— ) +AN(t,p+ l)X t - P -i, 



Therefore, by using (1) and X^_ i N = (Xt^i+i, we have 



't-i,N 



1 P { 



* - i M 1 

+ -^2 — {A-N{t,p+ l)Af t _p_x,jv}i+l 



Ht-i,N(t) + Gt-i,N\t), 



(94) 
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where H t _^ N (t) <G a(Z t -i, Z t - P ) and G t -i,isr(t) G Ft-i- Since H t -i,N(t) and G t -i,isr(t) 
are positive this implies, with (93), 



where 



{E(Xt, N X? N \r t -k)} > A mi „{P t ,jv}, 



Pt,N = fXiJ2E{Ht-i, N (tf)B t , N {i 



To bound this we define the corresponding terms for the stationary approximation Xt (u) . 
We set 



P(u) =fi 4 j2 E ( H t-i(u, t) 2 )B(u, i), 



i=0 



where 



1 P 



l ~ l j=0 

and B(u,i) = Q(uYDD T (<d(u) l ) T . A close inspection of the above calculation steps re- 
veals that 

p(u)=ny P (u)Y p (u) T } 

with Y_ (u) from Assumption 2.1(iv). Therefore, 

Amin ( P (^) I " ^ f Amin{F(u)} " °- 



Since {oj (-)}j is /3-Lipschitz, we have, for i = 0, . . . ,p — 1, 
E(# t _/^,i) ) -E(iJ t _^(i) 2 ) 



< 



if 



Bt,N(i)-B[ -,i 



< 



K 

w 



Therefore \\P t<N - P(t/N)\\ < \P t N - P(t/A)|i < K/N?, which leads to 






and therefore to 

KuUnXt,NXj N \T t ^ N }) >C--^. (95) 

Thus for $N$ large enough we have (89). □
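The eigenvalue bounds of this kind can be illustrated by simulation. The sketch below estimates the smallest eigenvalue of $E\{\mathcal X_t\mathcal X_t^T/|\mathcal X_t|_1^2\}$ for a stationary ARCH(1) process by Monte Carlo, taking $\mathcal X_t = (1, X_t^2)^T$; the Gaussian innovations, the parameter values and the function name `min_eig_normalized_outer` are assumptions made for the example, and a positive estimate is only suggestive, not a substitute for the lemma.

```python
import numpy as np

def min_eig_normalized_outer(a0, a1, n=20000, burn_in=500, seed=1):
    """Monte Carlo estimate of lambda_min(E[X X^T / |X|_1^2]) for ARCH(1)."""
    rng = np.random.default_rng(seed)
    x2 = a0 / (1.0 - a1)                 # start at the stationary mean
    acc = np.zeros((2, 2))
    for t in range(n + burn_in):
        x2 = rng.standard_normal() ** 2 * (a0 + a1 * x2)
        if t >= burn_in:
            y = np.array([1.0, x2]) / (1.0 + x2)   # X / |X|_1
            acc += np.outer(y, y)
    return np.linalg.eigvalsh(acc / n)[0]          # smallest eigenvalue
```

Because the normalized vector has components summing to one, the trace of the averaged matrix is at most one, so the smallest eigenvalue lies in $(0, 1/2]$ whenever $X_t^2$ is non-degenerate.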

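Lemma A.2. Suppose that Assumption 2.1 holds with $r = 4$. Then, for $k \ge p+1$, we have for $N$ large enough,

$$\lambda_{\min}\Big\{E\Big(\frac{\mathcal X_{t,N}\mathcal X_{t,N}^T}{|\mathcal X_{t,N}|_1^2}\,\Big|\,\mathcal F_{t-k}\Big)\Big\} > \frac{C}{|\mathcal X_{t-k,N}|_1^4} \qquad (96)$$

and

$$\lambda_{\min}\Big\{E\Big(\frac{\mathcal X_t(u)\mathcal X_t(u)^T}{|\mathcal X_t(u)|_1^2}\Big)\Big\} > C, \qquad (97)$$

where $C$ is a constant independent of $t$, $N$ and $u$.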
Proof. We first prove (96). By definition, 



l^f)]= inf Et _ fe K^ V 



l^t.Jvli / J l£l = 1 V 



N 1 



Since 



(x X t ,N) = -pn T~X X t ,N\Xt,N\l 

and sup|.j,| =1 \x Xt,N\x < |^t,j\r|i, we obtain by using Cauchy's inequality and (28), 

E t - fe ( £ T A- t ,,v) 2 < Ie^^^^) 2 !^!^^^!^!!) 2 } 1 / 2 



Therefore by using the above and Lemma A.l for large N, we obtain 
inf E t _ J^L] 2 > inf [ ^MMl> ° 



\x\=i \ \X tt N\i J \x\=i ~E t -k(\X t ,N\i) \Xt- 

where C is a positive constant, thus giving (96). 
To prove (97), we use (88) to obtain 

E{X t (u)X t (u) T ) = ^ 00 (X t {u)X t (u) T ) > C, 



k,N\\ 
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and using the arguments above we have 



\X,{u)\\ M E(|^_,(«)|{) ' 
By Lemma 4.1 sup u E(|At(u)|f) < oo, which leads to (97). □ 

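Corollary A.2. Suppose Assumption 2.1 holds with $r = 4$. Let $F(u)$ be defined as in (9). Then there exist $C$ and $\lambda_1$ such that, for all $\lambda \in [0, \lambda_1]$ and $u \in (0, 1]$, we have

$$\lambda_{\max}\{I - \lambda F(u)\} \le 1 - \lambda C. \qquad (98)$$

There exist $0 < \delta \le C$ and $K$ such that, for all $k$,

$$\|\{I - \lambda F(u)\}^k\| \le K(1 - \lambda\delta)^k. \qquad (99)$$

Furthermore,

$$\sum_{k=0}^{t-1}\lambda\{I - \lambda F(u)\}^{2k} \to \tfrac{1}{2}F(u)^{-1}, \qquad (100)$$

where $\lambda \to 0$ and $\lambda t \to \infty$ as $t \to \infty$.

Proof. Inequality (98) follows directly from (97). Furthermore, since $(I - \lambda F(u))$ is a symmetric matrix, we have $\|\{I - \lambda F(u)\}^k\| \le \|I - \lambda F(u)\|^k \le (1 - \lambda\delta)^k$, that is, (99). We now prove (100). We have

$$\sum_{k=0}^{t-1}\lambda\{I - \lambda F(u)\}^{2k} = \lambda\{I - (I - \lambda F(u))^2\}^{-1}\{I - (I - \lambda F(u))^{2t}\}.$$

Since $\lambda_{\min}\{F(u)\} > C$, for some $C > 0$, we have $|\{I - \lambda F(u)\}^{2t}|_1 \le \|\{I - \lambda F(u)\}^{2t}\|\,|I_{p+1}|_1 \to 0$ as $\lambda \to 0$, $\lambda t \to \infty$ and $t \to \infty$. Furthermore, $\lambda\{I - (I - \lambda F(u))^2\}^{-1} \to \tfrac{1}{2}F(u)^{-1}$. Together these give (100). □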
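Both the geometric bound (99) and the limit (100) are easy to probe numerically for a fixed symmetric positive definite matrix. In the sketch below, `F` is an arbitrary test matrix standing in for $F(u)$, and the helper names are hypothetical. The check of (100) relies on $\lambda\{I - (I - \lambda F)^2\}^{-1} = \{F(2I - \lambda F)\}^{-1}$, which differs from $\tfrac12 F^{-1}$ by a term of order $\lambda$.

```python
import numpy as np

def squared_weight_sum(F, lam, t):
    """Compute sum_{k=0}^{t-1} lam * (I - lam F)^{2k} by direct summation."""
    d = F.shape[0]
    G2 = np.linalg.matrix_power(np.eye(d) - lam * F, 2)
    power = np.eye(d)            # holds (I - lam F)^{2k}
    total = np.zeros((d, d))
    for _ in range(t):
        total += lam * power
        power = power @ G2
    return total

def spectral_norm_bound_holds(F, lam, k):
    """Check ||(I - lam F)^k|| <= (1 - lam * delta)^k, delta = lambda_min(F)."""
    delta = np.linalg.eigvalsh(F)[0]
    G = np.eye(F.shape[0]) - lam * F
    lhs = np.linalg.norm(np.linalg.matrix_power(G, k), 2)
    return bool(lhs <= (1.0 - lam * delta) ** k + 1e-12)
```

For symmetric $F$ with $\lambda\,\lambda_{\max}(F) \le 1$ the norm bound holds with equality and constant $K = 1$, which is why a small tolerance is added in the comparison.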

Lemma A.3. Suppose that Assumption 2.1 holds with $r = 4$. For a sufficiently large $N$ and for every $R > 0$, there exist an $s_0 > p+1$ and $C_1$ such that, for all $s \ge s_0$ and $t = 1, \dots, N - s$, we have



where I denotes the identity function. 
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Proof. The result can be proved using the methods given in Moulines et al. ([11], 
Lemma 19). We outline the proof. By using $\lambda_{\min}(A) \ge \lambda_{\min}(B) - \|A - B\|$ (see Moulines et al. [11], Lemma 19), we have



n+s-i 

E 

V k=t 



XkNXj 



k.N 



l^fc.ivli 



t+s-1 

> Amin< i ] 



k=t 



t+s-1 

E< 

k=t 



— ■ ~- - E t 



\Xk,N\l 



\XkMi 

We now evaluate an upper bound for the second term above. Let 

V yT / v yT 



X k,NX^ N 
\Xk,N\l 



\Xh,n\1 



E, 



V l^fc.Jvli 



Then 





t+s-1 


2 


t+s-1 


2 p+1 t+s-1 


Et 


E 


<E t 


E 


^ E E Mi^MkA^^N)* 




k=t 




k=t 


— l ki,k2—t 



and we require a bound on E t ((Afe 1 ,N)i,j{^k 2 ,N)i.j)- F° r fei — ^2 = k we obtain with (28), 

E t {(A fe>iv ) i , j } 2 < 2E t |Afc,jv|f < K\X t , N \i- (102) 

Now let fci 7^ fc2- Let </>(x) = 2L T 2;/|ie|i, where x — (l,x\, . . . ,x p ). Since G Lip(l), by 
using (27) we have, for k\ < hi , 



|Efe! ((Afc 2 ,iv)i,j)| 



Ei 



^ V 1^2, JV 1 1 7 'V l^fea.JvI? 
< ^-^(Etd^^li) + |^ IjJV |l). 
By using (28), (102) and (103) we have 

|E t {(A fcl , JV ) lJ (A fe2 , JV ) iJ }| < |E t {E fcl (A fel , JV ) JJ (A fc2 , JV ) lJ }| 

< |E t {(A fcl ^) iJ E fel (A fc2 , JV ) lJ }| 

<{E t {^ ku N)l j } 1/ \E t {E k ^,N)i, 3 ?} l/2 
<Kp k *-^\X ttN \\<Kp k *-^\X ttN \i 

t+s-1 



(103) 



(104) 



Substituting (104) into (101), we obtain E t || Y?k=t A 



k ,N\\ 2 = Ks\X tfN \f, where K is a 
constant independent of s. Therefore, by using this and (96), we have, for N sufficiently 
large 



Etl K 



H+S-l y -yT 



>^\X, N \^-K{ S \X t , N \iV /2 - 
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Therefore for any R > 0, 

E'^jjf )} ^ ^(^ 1/2 -A-imi*^ii<#)- 

Now choose So and a corresponding Ci such that 

1/2 

Then it is clear that if s > sq then we have 



Ci = ^(^o /2 -^i? 2 )>0. 



E *{ A -in ( E 1 ) } ^ ^( CS1/2 - ^d^ll ^ R ) 

>CiI(\X t , N \ 1 <R), 

thus giving the result. □ 

Proof of Theorem 4.1. We prove the result by verifying the conditions in Moulines
et al. ([11], Theorem 16); see also Dahlhaus and Subba Rao [3]. Let <fii := |A^jv|i, 
Vi := \Xi h\i and Ai := F( n = ^,jv^ jv/I^.aHi- We now show that conditions (a)-(d) in 
Moulines et al. ([11], Theorem 16) are satisfied. Let $k \ge p+1$ and $1 \le s \le N - k$. Then
by using (26) we have E(Vt+ s \Ft) < K(l + p s \X t ,N\i)- Therefore for any R\ and s > 0, 
we have 

E(V t+s \Ft) < \kp s + ^I{\X t , N \i > Ri)^\X t ,N\i+KI{\X t , N \i < Ri)- 
Thus we have

E(Vt+M < [Kp s + ^J/d^li > Ri)}\Xt, N \i + K( P S + l)I(\X t , N \i < Ri)\Xt, N \i- 

By choosing an appropriate R\ we can find an s\ such that, for all s > Si, we have 
7\ p s + A'/ i?i < 1 and thus condition (a) is satisfied. Condition (b) directly follows from 
Lemma A. 3. Let be a (p+ 1) x (p + 1) matrix where (I p+ i)y = 1 for 1 < i, j < (p+ 1). 
Since H-F^wll = X^n^^n /\Xi,n\i = 1, f° r A < 1, we have AHi^jvll < 1, hence condition (c) 
is satisfied. Finally, by using the above for any q > 1 and t G {N — p, . . . , t}, we have 



r+si 

E 

I=i 



E(||F^||^)<fci(P + l) 29 . 



Therefore condition (d) is satisfied and Theorem 4.1 follows from Moulines et al. ([11], 
Theorem 16). □ 
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A.3. The lower-order terms in the perturbation expansion

In this section we will prove the auxiliary results required in Section 4.2, where we showed that the second-order terms in the perturbation expansion were of lower order than the principal terms. The analysis of $J^{x,2}_{t_0,N}$ is based on partitioning it into two terms,

$$J^{x,2}_{t_0,N} = A^x_{t_0,N} + B^x_{t_0,N},$$

similar to the partition in (52). $A^x_{t_0,N}$ is the weighted sum of $\{F_{k,N} - F_k(u)\}$, whereas $B^x_{t_0,N}$ is the weighted sum of the differences between the stationary approximation $F_k(u_0)$ and $E(F_0(u_0))$, that is, of $\bar F_k(u_0) = F_k(u_0) - F(u_0)$. In this section we evaluate bounds for these two terms. We require the following lemma.

Lemma A.4. Suppose Assumption 2.1 holds with some $r > 1$. Let $\mathcal M_{t,N}$ and $\mathcal M_t(u)$ be defined as in (29) and let $D_{t,N}$ be defined as in (45). Then we have



X t , N X t J N X t {u)X t {u) 



+ Rt,N{u) 



and 



where 



M t ,N = M t (u) + (1 + Z t 2 )i? tiW (u), 



(105) 



\Rt,N(u)\ < 



N 



D 



t.N- 



Further, for q < qo there exists a constant K independent of t, N and u such that 



(E\Rt iN (u)\ q ) 1/q <K 



t 

N 



p+l 

N 



(106) 



Proof. The proof uses (21) and the method given in Dahlhaus and Subba Rao ([4], Lemma A.4). □



We now give a bound for a general A* 



t,N- 



Lemma A. 5. Suppose Assumption 2.1 holds with r = qo and let {G^at} be a random 
process which satisfies sup t N \\Gt t N\\f < oo. Let F t .N , F t (u) and F(u) be defined as in 
(30) and (9), respectively. Then if \to/N — uq\ < l/N and q < qo we have 



E 



to— p— 1 

£ (7- \F(u )) k (F to _ k _ hN - F to _ fe _ 1 (n ))G to _ fe 

fc=0 

K 



N 



9/2\ 2/q 



(107) 
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where K is a finite constant. 

Proof. By using (99) and (105) we obtain the result. 



□ 



In order to bound $E|B^x_{t_0,N}|^q$ we need a Burkholder-type inequality (using Minkowski's inequality is not sufficient). This inequality is embedded in the following lemma, which is a generalization of Proposition B.3 in Moulines et al. [11]. It can be proved by adapting the proof in Moulines et al. [11]; see also Dahlhaus and Subba Rao [3].

Lemma A.6. Suppose $\{M_t\}$ and $\{F_t\}$ are random matrices and $F$ is a positive definite, deterministic matrix, with $\lambda_{\min}(F) > \delta$, for some $\delta > 0$. Let $\mathcal F_t = \sigma(F_t, M_t, F_{t-1}, M_{t-1}, \dots)$. Assume, for some $q > 2$, the following:

(i) $\{F_t\}$ are identically distributed with mean zero.

(ii) $E(M_t \mid \mathcal F_{t-1}) = 0$.

(iii) $(E|E(F_t \mid \mathcal F_k)|^{2q})^{1/(2q)} \le K\rho^{t-k}$.

(iv) $(E|E(M_tF_s \mid \mathcal F_k) - E(M_tF_s)|^q)^{1/q} \le K\rho^{s-k}$ if $k \le s \le t$.

(v) $\sup_t (E|M_t|^{2q})^{1/(2q)} < \infty$ and $(E|F_t|^{2q})^{1/(2q)} < \infty$.

Then we have



E 



t-p-l t-p-fc-2 



t-k-lMt-k-i-l 



g\ 1/9 



< 



K 
6X' 



(108) 



where K is a finite constant. 

We now apply the lemma above to the particular example of the ANRE algorithm. 

Lemma A.7. Suppose Assumption 2.1 holds for $r > 4$. Let $\{F_t(u)\}$, $\{\mathcal M_{t,N}\}$ and $\{\mathcal M_t(u)\}$ be defined as in (29) and $\bar F_t(u) = F_t(u) - E(F_t(u))$. Then



E 



to— p— 1 to—p — k—2 



k=0 



i=0 



r/2-. 2/r 



<§ (109) 



and 



E 



^ o — p— 1 tft—p — k — 2 
k=0 i=0 



k-i-1 



(«) 



r/2s 2/r 



<f. (110) 



Proof. We prove (109); the proof of (110) is the same. We verify the conditions in Lemma A.6, then (109) immediately follows. By using (97) we have that $\lambda_{\min}\{F(u)\} > \delta$, for some $\delta > 0$. Let $M_t := \mathcal M_{t,N}$, $F_t := \bar F_t(u)$, $F := F(u)$ and $\mathcal F_t = \sigma(Z_t, Z_{t-1}, \dots)$. It is clear from






the definition that the {F t (u)}t have zero mean and are identically distributed; also 
E(Mt{u)\J 7 t -i) = Op+ixp+i- By using (24) we have 

(E|E(^( U )|JF fe )f)V- = ( E | E (^( u )|^)_ E (iP t ( u ))p)Vr<^ p t- fcj 

thus condition (iii) is satisfied. Since M t ,N — (Zf — l)of N Xt-i,N /\Xt-i,N\\, an d F t < 
Ip+i and fj^ j\j^t—i,N 1 1 ^t—i.N 1 1 Ip+i j by using Corollary A.l and sup t N {E\X t . N \ r / 2 ) 2 / r < 
oo, we can show that condition (iv) is satisfied. Moreover, F t .N is a bounded random 
matrix, hence all its moments exist. Finally, since sup t N \A4t,N\r < K(Zf + l) r and 
E(Z 2r ) < oo, we have, for all k < s <t < N, that sup tJV (E|A / Jt,Ar| r ) < oo, leading to 
condition (v). Thus all the conditions of Lemma A. 6 are satisfied and we obtain (109). □ 
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